Abstract



A capacity-achieving polar coding scheme is introduced for reliable communications over a set of parallel
channels. The parallel channels are assumed to be arbitrarily-permuted memoryless binary-input output-symmetric
(MBIOS) channels. A coding scheme is first provided for the particular case where the parallel channels form a
set of stochastically degraded channels. Next, the more general case where the parallel channels are not
necessarily degraded is considered, and two modifications are provided for this case.


I. Introduction

Channel coding over a set of parallel arbitrarily-permuted channels is studied in [13]. The information message
in such a setting is encoded into a set of codewords, all with a common block length. These codewords are
transmitted over a set of parallel discrete memoryless channels (DMCs), where the assignment of codewords to
channels is arbitrary. This assignment is known only to the receiver, which decodes the transmitted message based
on the set of received vectors. In cases where the capacities of all of the parallel channels are achieved with a
common input distribution, it is proved in [13] that the capacity of the considered parallel setting equals the sum
of the capacities of the parallel channels. Such parallel channel models are of concern when analyzing networking
applications, OFDM and BICM systems.

The coding schemes suggested in [13] for the considered parallel setting are based on random coding and
joint-typicality decoding. One of the main contributions of [13] is the introduction of a concatenation of rate-matching
codes with parallel copies of a fully random block code. A rate-matching code is a device that encodes a single
message into a set of messages. It is shown in [13] that under specific structural conditions on the rate-matching
code, such a concatenated scheme can achieve the capacity of the set of parallel channels. Moreover, it is shown
how to construct these rate-matching codes from a set of maximum-distance separable (MDS) codes. The decoding
procedure for the concatenated scheme is based on successive cancellation and joint-typicality (list) decoding.

This research was supported by the Israel Science Foundation (grant no. 1070/07), and by the European Commission in the framework of 
the FP7 Network of Excellence in Wireless Communications (NEWCOM++). 
Igal Sason is the corresponding author (E-mail: sason@ee.technion.ac.il). 



Polar codes form a class of capacity-achieving block codes [1]. These codes are shown to achieve the capacity of
a symmetric DMC with a practical encoding and decoding complexity (in terms of the block length). Encoding of
polar codes is defined based on a recursive approach. This recursion is a key ingredient both in proving the
capacity-achieving properties of polar codes and in their successive-cancellation decoding procedure. A set of
predetermined and fixed bits is incorporated in the encoding procedure of polar codes, and it plays a crucial role
in the decoding process.

Parallel polar coding schemes are provided in this paper for communication over binary-input arbitrarily-permuted
memoryless and symmetric parallel channels. The particular case where the parallel channels form a set of
stochastically degraded channels is addressed first, and a parallel polar coding scheme is provided for this case.
While the provided scheme achieves the capacity of degraded parallel channels, it is shown not to achieve capacity
for the general case where the channels are no longer degraded. Finally, two modifications are provided for the
general case where the channels are not necessarily degraded. Both modifications are shown to achieve
the capacity for the case at hand.

The main difference between the proposed coding schemes and the original polar coding scheme in [1] is in the
setting of the predetermined and fixed bits which are incorporated in the encoding and decoding procedures. In [1],
for a symmetric DMC, these bits may be chosen arbitrarily; they are fixed and do not depend on the transmitted
message. For the provided scheme, some of the concerned bits incorporate an algebraic structure and depend on
the transmitted message. Moreover, the determination of these bits is based on the structural properties of MDS
codes, in a manner which relates to the rate-matching code in [13].

This paper is structured as follows: Section II provides some preliminary material. The parallel polar coding
scheme is introduced and analyzed in Section III for the particular case of degraded parallel channels. Two modified
parallel schemes which achieve the capacity for the case of non-degraded channels are studied in Section IV.
Section V concludes the paper.



II. Preliminaries 

A. Arbitrarily Permuted Parallel Channels 

We consider the communication model in Figure 1. A message $m$ is transmitted over a set of $S$ parallel
memoryless channels. The notation $[S] \triangleq \{1, \ldots, S\}$ is used in this paper. All channels are assumed to have
a common input alphabet $\mathcal{X}$, and possibly different output alphabets $\mathcal{Y}_s$, $s \in [S]$. The transition probability function
of each channel is denoted by $P_s(y_s|x)$, where $y_s \in \mathcal{Y}_s$, $s \in [S]$, and $x \in \mathcal{X}$. For the particular case depicted
in Figure 1, the communication takes place over a set of $S = 3$ parallel channels. The encoding operation maps
the message $m$ into a set of $S$ codewords $\{\mathbf{x}_s \in \mathcal{X}^n\}_{s=1}^{S}$. Each of these codewords is of length $n$, and it is
transmitted over a different channel. The mapping of codewords to channels is done by an arbitrary permutation
$\pi : [S] \to [S]$. The permutation $\pi$ is part of the communication channel model; the encoder has no control over the
arbitrary permutation chosen during the codeword transmission. The permutation $\pi$ is fixed during the transmission
of the codewords. The set of $S$ possible channels is known at both the encoder and decoder. The encoder has no
information about the chosen permutation. The decoder, on the other hand, knows the specific chosen permutation.
Formally, the channel is defined by the following family of transition probabilities:

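$$\Bigl\{ P(\mathbf{Y}|\mathbf{X};\pi) : \ \mathbf{Y} \in \{\mathcal{Y}_1 \times \mathcal{Y}_2 \times \cdots \times \mathcal{Y}_S\}^n, \ \mathbf{X} \in \mathcal{X}^{S \times n}, \ \pi : [S] \to [S] \Bigr\}_{n=1}^{\infty}$$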





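[Figure 1: block diagram. The encoder maps the message $m$ to the codewords $\mathbf{x}_1$, $\mathbf{x}_2$, $\mathbf{x}_3$; the permutation $\pi$
feeds $\mathbf{x}_{\pi(1)}$, $\mathbf{x}_{\pi(2)}$, $\mathbf{x}_{\pi(3)}$ to channels 1, 2 and 3, respectively; the decoder outputs an estimate of $m$.]

Fig. 1: Communication over an arbitrarily-permuted parallel channel. The particular case of communicating over $S = 3$ parallel channels
is depicted (taken from [13]).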



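where $\mathbf{X} = (\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_S)$ are the transmitted codewords, $\mathbf{Y} = (\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_S)$ are the received vectors,

$$P(\mathbf{Y}|\mathbf{X};\pi) = \prod_{s=1}^{S} P_s(\mathbf{y}_s|\mathbf{x}_{\pi(s)}) \tag{1}$$

is the probability law of the parallel channels, and $\pi : [S] \to [S]$ is the arbitrary permutation mapping of codewords
to channels.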

The coding problem for this communication model is to guarantee reliable communication for all $S!$ possible
permutations $\pi$. This problem is formulated and studied in [13].

Definition 1 (Achievable rates and channel capacity). Consider coded communication over a set of $S$ arbitrarily
permuted parallel channels. A rate $R > 0$ is achievable if there exists a sequence of encoders and decoders such
that for all $\delta > 0$ and a sufficiently large block length $n$

$$\frac{1}{n} \log_2 M > R - \delta \tag{2}$$

$$P_e^{(\pi)}(n) < \delta, \quad \text{for all } S! \text{ permutations } \pi : [S] \to [S] \tag{3}$$

where $M$ is the number of possible messages and $P_e^{(\pi)}(n)$ is the average block error probability for a fixed
permutation $\pi$ and block length $n$. The capacity $C_\Pi$ of the considered model is the maximal achievable rate
satisfying (2) and (3).

The following theorem may be derived as a particular case of the well-known results on the capacity of the
compound channel (see, e.g., [3] and references therein). Nevertheless, the theorem is stated in the restricted form
as provided in [13]:

Theorem 1 (The capacity of arbitrarily-permuted memoryless parallel channels [13]). Consider the transmission
over a set of $S$ arbitrarily-permuted memoryless parallel channels. Assume that there is an input distribution
that achieves capacity for all parallel channels. Then, the capacity $C_\Pi$ satisfies

$$C_\Pi = \sum_{s=1}^{S} C_s \tag{4}$$

where $C_s$ is the capacity of the $s$-th channel, $s \in [S]$.
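As a small numerical illustration of (4), the following sketch sums the capacities of three binary-input symmetric channels. The particular channels (two binary erasure channels and one binary symmetric channel) and their parameters are assumptions chosen for illustration only; for all of them the equiprobable input distribution achieves capacity, so the condition of Theorem 1 holds.

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - h2(p)

def bec_capacity(eps):
    """Capacity of a binary erasure channel with erasure probability eps."""
    return 1.0 - eps

def parallel_capacity(capacities):
    """Capacity of the arbitrarily-permuted parallel channel as in (4)."""
    return sum(capacities)

# Hypothetical example: S = 3 parallel channels.
caps = [bec_capacity(0.1), bsc_capacity(0.11), bec_capacity(0.5)]
c_pi = parallel_capacity(caps)
```

Note that the sum does not depend on the permutation; the difficulty addressed in the paper is achieving it with a single code for all permutations.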



As noted in [13], if both the encoder and decoder know the actual permutation $\pi$, then the capacity is clearly
given by $\sum_{s=1}^{S} C_s$. Since in the considered channel model the encoder does not know the actual permutation, it follows that

$$C_\Pi \le \sum_{s=1}^{S} C_s.$$

That equality is achievable is proved in [13] using two different approaches:

1) A random coding argument and a joint-typicality decoding over product channels. This coding scheme is
based on the notion of product channels as defined in [13]. Each possible permutation $\pi$ yields a different
product channel. Consequently, there are $S!$ possible product channels. A properly chosen random code is
shown to achieve the capacity $C_\Pi$ under a joint-typicality decoding scheme for all possible permutations $\pi$.

2) A rate-matching coding scheme that is combined with a random coding argument, and a sequential
joint-typicality decoding. The construction technique for rate-matching codes in [13], based on MDS codes,
provided an important intuition for the parallel polar schemes introduced in the following sections.

B. Polar Codes 

This preliminary section offers a short summary of the basic definitions and results in [1], [2] that are essential
in the following sections. For a DMC, polar codes achieve the mutual information between an equiprobable input
and the channel output. It is well known that the information rate under equiprobable inputs is equal to the channel
capacity of a symmetric DMC.

Definition 2 (Symmetric binary input channels). A DMC with a transition probability $p$, a binary-input alphabet
$\mathcal{X} = \{0, 1\}$, and an output alphabet $\mathcal{Y}$ is said to be symmetric if there exists a permutation $\mathcal{T}$ over $\mathcal{Y}$ such that

1) The inverse permutation $\mathcal{T}^{-1}$ is equal to $\mathcal{T}$, i.e.,

$$\mathcal{T}^{-1}(y) = \mathcal{T}(y)$$

for all $y \in \mathcal{Y}$.

2) The transition probability $p$ satisfies

$$p(y|0) = p(\mathcal{T}(y)|1)$$

for all $y \in \mathcal{Y}$.

Let $p$ be a transition probability function of a binary-input DMC with an input alphabet $\mathcal{X} = \{0, 1\}$ and an
output alphabet $\mathcal{Y}$. Polar codes are defined in [1] using a recursive channel synthesizing operation which is referred
to as channel combining. The synthesized channel, after $i \ge 1$ recursive steps, has a block input of length $n = 2^i$
bits and is denoted by $p_n$. The output alphabet of the combined channel is $\mathcal{Y}^n$. The recursive construction of $p_n$ is
equivalently defined by using a linear encoding operation. An $n \times n$ matrix $G_n$, referred to as the polar generator
matrix of size $n$, can be recursively defined and the combined channel can be shown to satisfy:

$$p_n(\mathbf{y}|\mathbf{w}) = p(\mathbf{y}|\mathbf{w} G_n) \tag{5}$$

for all $\mathbf{y} \in \mathcal{Y}^n$ and $\mathbf{w} \in \mathcal{X}^n$.
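The linear encoding operation in (5) can be sketched as follows. This is a minimal illustration, assuming that $G_n$ is taken as the $i$-fold Kronecker power of the kernel [[1, 0], [1, 1]] without the bit-reversal permutation used in [1]; the function names are hypothetical.

```python
def polar_generator(i):
    """Return the n x n binary matrix G_n, n = 2**i, as a list of rows.

    Assumption: G_n is the i-fold Kronecker power of F = [[1, 0], [1, 1]],
    without the bit-reversal permutation.
    """
    G = [[1]]
    for _ in range(i):
        n = len(G)
        top = [row + [0] * n for row in G]   # block row [G, 0]
        bottom = [row + row for row in G]    # block row [G, G]
        G = top + bottom
    return G

def polar_encode(w, G):
    """Compute x = w G over GF(2)."""
    n = len(G)
    return [sum(w[r] * G[r][c] for r in range(n)) % 2 for c in range(n)]

G4 = polar_generator(2)
x = polar_encode([1, 0, 1, 1], G4)
```

Under this assumption the matrix is its own inverse over GF(2), so applying `polar_encode` twice returns the input vector.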

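Let $\mathcal{A}_n \subseteq [n]$, and denote by $\mathcal{A}_n^c$ the complementary set of $\mathcal{A}_n$ (i.e., $\mathcal{A}_n^c = [n] \setminus \mathcal{A}_n$). Given a set $\mathcal{A}_n$, a class
of coset codes is formed, all with a code rate equal to $\frac{|\mathcal{A}_n|}{n}$. Over the indices specified by $\mathcal{A}_n$, the components
of $\mathbf{w}$ are set according to the information bits. The rest of the bits of $\mathbf{w}$ are predetermined and fixed according
to a particular code design. The set $\mathcal{A}_n$ is referred to as the information set. Polar codes are constructed by
a specific choice of the information set $\mathcal{A}_n$. This construction can be shown to be equivalent to a coset code
$\mathcal{C}(G_n(\mathcal{A}_n), \mathbf{b} G_n(\mathcal{A}_n^c))$, where $\mathbf{b} \in \mathcal{X}^{|\mathcal{A}_n^c|}$ is the vector of predetermined and fixed bits, $G_n(\mathcal{A}_n)$ denotes the
$|\mathcal{A}_n| \times n$ sub-matrix of $G_n$ defined by the rows of $G_n$ whose indices are in $\mathcal{A}_n$, $G_n(\mathcal{A}_n^c)$ denotes the
$|\mathcal{A}_n^c| \times n$ sub-matrix of $G_n$ formed by the remaining rows in $G_n$, and

$$\mathcal{C}(G, \mathbf{c}) \triangleq \bigl\{ \mathbf{x} : \ \mathbf{x} = \mathbf{u} G + \mathbf{c}, \ \mathbf{u} \in \mathcal{X}^{|\mathcal{A}_n|} \bigr\}. \tag{6}$$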

Channel splitting is another important operation that is introduced in [1] for polar codes. The split channels
$\{p_n^{(l)}\}_{l=1}^{n}$, all with a binary input alphabet $\mathcal{X}$ and output alphabets $\mathcal{Y}^n \times \mathcal{X}^{l-1}$, $l \in [n]$, are defined according to

$$p_n^{(l)}(\mathbf{y}, \mathbf{w}|x) \triangleq \frac{1}{2^{n-1}} \sum_{\mathbf{c} \in \mathcal{X}^{n-l}} p_n\bigl(\mathbf{y}|(\mathbf{w}, x, \mathbf{c})\bigr) \tag{7}$$

where $\mathbf{y} \in \mathcal{Y}^n$, $\mathbf{w} \in \mathcal{X}^{l-1}$, and $x \in \mathcal{X}$. The importance of channel splitting is due to its role in the successive
cancellation decoding procedure that is provided in [1]. The decoding procedure iterates over the index $l \in [n]$. If
$l \in \mathcal{A}_n^c$, then the bit $w_l$ is a predetermined and known bit. Otherwise, the bit $w_l$ is decoded as if it is transmitted over
the corresponding split channel $p_n^{(l)}$ in (7). This decoding procedure is referred to in the following as a standard polar
successive cancellation decoding procedure. It is shown in [1] that the successive cancellation decoding procedure
has a complexity of $O(n \log n)$.
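To illustrate how an information set may be selected in practice, the sketch below treats the binary erasure channel (BEC), for which the Bhattacharyya parameters of the split channels follow the exact recursion of [1]: a parameter $z$ is mapped to $2z - z^2$ and $z^2$ at each step. The choice of the BEC, the index ordering induced by this recursion, and the rule of picking the $k$ smallest parameters are assumptions made for illustration; they are not the construction analyzed in this paper.

```python
def bec_bhattacharyya(eps, i):
    """Bhattacharyya parameters of the n = 2**i split channels of a BEC(eps)."""
    z = [eps]
    for _ in range(i):
        nxt = []
        for v in z:
            nxt.append(2 * v - v * v)  # degraded ("minus") split channel
            nxt.append(v * v)          # upgraded ("plus") split channel
        z = nxt
    return z

def information_set(z, k):
    """1-based indices of the k split channels with the smallest parameters."""
    order = sorted(range(len(z)), key=lambda l: z[l])
    return sorted(l + 1 for l in order[:k])

z4 = bec_bhattacharyya(0.5, 2)
A4 = information_set(z4, 2)
```

For the BEC the parameters sum to $n$ times the erasure probability at every recursion level, which is a quick sanity check on the recursion.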



C. Stochastically degraded parallel channels 

The polarization properties of stochastically degraded parallel-channels are studied in this section. 

Definition 3 (Stochastically degraded channels). Consider two memoryless channels with a common input
alphabet $\mathcal{X}$, transition probability functions $P_1$ and $P_2$, and two output alphabets $\mathcal{Y}_1$ and $\mathcal{Y}_2$, respectively. The
channel $P_2$ is a stochastically degraded version of channel $P_1$ if there exists a channel $D$ with an input alphabet
$\mathcal{Y}_1$ and an output alphabet $\mathcal{Y}_2$ such that

$$P_2(y_2|x) = \sum_{y_1 \in \mathcal{Y}_1} P_1(y_1|x) D(y_2|y_1), \quad \forall x \in \mathcal{X}, \ y_2 \in \mathcal{Y}_2. \tag{8}$$
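The relation (8) is a product of transition matrices. The following sketch evaluates it for a hypothetical example in which both $P_1$ and the degrading channel $D$ are binary symmetric channels; the crossover probabilities 0.1 and 0.2 are assumptions chosen for illustration.

```python
def cascade(P1, D):
    """Evaluate (8): P2(y2|x) = sum over y1 of P1(y1|x) * D(y2|y1).

    Channels are given as row-stochastic matrices, P[x][y] = P(y|x).
    """
    return [[sum(P1[x][y1] * D[y1][y2] for y1 in range(len(D)))
             for y2 in range(len(D[0]))]
            for x in range(len(P1))]

def bsc(p):
    """Transition matrix of a binary symmetric channel."""
    return [[1 - p, p], [p, 1 - p]]

P1 = bsc(0.1)
D = bsc(0.2)
P2 = cascade(P1, D)  # a BSC with crossover 0.1*0.8 + 0.9*0.2 = 0.26
```

The resulting channel is again a BSC with a larger crossover probability, hence a stochastically degraded version of $P_1$.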

Lemma 1 (On the degradation of split channels). Let $P_1$ and $P_2$ be two transition probability functions with a
common binary input alphabet $\mathcal{X} = \{0, 1\}$ and two output alphabets $\mathcal{Y}_1$ and $\mathcal{Y}_2$, respectively. For a block length
$n$, the split channels of $P_1$ and $P_2$ are denoted by $P_{1,n}^{(l)}$ and $P_{2,n}^{(l)}$, respectively, for all $l \in [n]$. Assume that the
channel $P_2$ is a stochastically degraded version of channel $P_1$. Then, for every $l \in [n]$, the split channel $P_{2,n}^{(l)}$ is a
stochastically degraded version of the split channel $P_{1,n}^{(l)}$.

Proof: The proof follows by induction [Si. 



Definition 4 (Stochastically degraded parallel channels). Let $\{P_s\}_{s=1}^{S}$ be a set of $S$ parallel memoryless channels,
and denote the capacity of $P_s$ by $C_s$ for all $s \in [S]$. In addition, assume without loss of generality that $C_s \ge C_{s'}$
for all $1 \le s < s' \le S$. The channels $\{P_s\}_{s=1}^{S}$ are stochastically degraded if for every $1 \le s < s' \le S$ the channel
$P_{s'}$ is a stochastically degraded version of $P_s$.

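Corollary 1 (On monotonic information sets for stochastically degraded parallel channels). Consider a set of
$S$ memoryless degraded and symmetric parallel channels $\{P_s\}_{s=1}^{S}$, with a common binary-input alphabet $\mathcal{X}$. For
every $s \in [S]$, denote the capacity of the channel $P_s$ by $C_s$, and assume without loss of generality that

$$C_1 \ge C_2 \ge \cdots \ge C_S.$$

Fix $0 < \beta < \frac{1}{2}$ and a set of rates $\{R_s\}_{s=1}^{S}$ where

$$0 < R_s < C_s, \quad \forall s \in [S].$$

Then, there exists a sequence of information sets $\mathcal{A}_n^{(s)} \subseteq [n]$, $s \in [S]$ and $n = 2^i$ where $i \in \mathbb{N}$, satisfying the
following properties:

1) Rate:

$$|\mathcal{A}_n^{(s)}| \ge n R_s, \quad \forall s \in [S]. \tag{9}$$

2) Monotonicity:

$$\mathcal{A}_n^{(S)} \subseteq \mathcal{A}_n^{(S-1)} \subseteq \cdots \subseteq \mathcal{A}_n^{(1)}. \tag{10}$$

3) Performance:

$$\Pr\bigl(\mathcal{E}_l(P_s)\bigr) \le 2^{-n^{\beta}} \tag{11}$$

for all $l \in \mathcal{A}_n^{(s)}$ and $s \in [S]$, and

$$\mathcal{E}_l(p) \triangleq \bigl\{ p_n^{(l)}(\mathbf{y}, \mathbf{w}^{(l-1)}|w_l) \le p_n^{(l)}(\mathbf{y}, \mathbf{w}^{(l-1)}|w_l + 1) \bigr\}, \quad l \in [n]. \tag{12}$$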

Proof: The rate and performance properties form immediate consequences of the polarization properties in [2].
It is left to prove that the choice of the information set sequences can be made such that the monotonicity property
in (10) is satisfied. Start with $s = S$. From [2] it follows that there exists a sequence of sets $\{\mathcal{A}_n^{(S)}\}$ satisfying (9)
and (11). Next, fix an $s' \in [S]$ and assume that for all $s > s'$, the set sequences $\{\mathcal{A}_n^{(s)}\}$ can be chosen such that
the properties in (9) and (11) are satisfied, and in addition

$$\mathcal{A}_n^{(S)} \subseteq \mathcal{A}_n^{(S-1)} \subseteq \cdots \subseteq \mathcal{A}_n^{(s'+1)}. \tag{13}$$

If $s' = S$ then (13) is satisfied in void. The existence of the sequence $\{\mathcal{A}_n^{(s')}\}$ satisfying (9) and (11) is already
provided by the polarization properties in [2]. It is left to verify that the set sequence can be chosen such that the
monotonicity property

$$\mathcal{A}_n^{(s'+1)} \subseteq \mathcal{A}_n^{(s')} \tag{14}$$

is kept. Choose an arbitrary index $l \in \mathcal{A}_n^{(s'+1)}$. It is proved that this index corresponds to the information set for
the channel $P_{s'}$. Specifically, the performance property in (11) is satisfied for $s = s'$. Since $P_{s'+1}$ is a degraded
version of $P_{s'}$, then according to Lemma 1 the split channel $P_{s'+1,n}^{(l)}$ is a degraded version of the split channel
$P_{s',n}^{(l)}$. It is clearly suboptimal to first degrade the observation vector $\mathbf{y} \in \mathcal{Y}_{s'}^n$ to create a vector in $\mathcal{Y}_{s'+1}^n$, and
only then detect the input bit $x$ for the degraded split channel. However, the detection error event for the degraded
split channel $P_{s'+1,n}^{(l)}$ satisfies the upper bound in (11). As a result, the optimal detection error for the better split
channel $P_{s',n}^{(l)}$ must also satisfy (11). Hence, all the indices in $\mathcal{A}_n^{(s'+1)}$ can be chosen for the set $\mathcal{A}_n^{(s')}$. The rest of the
indices are chosen arbitrarily out of the set of possible indices whose existence is guaranteed by the polarization
properties. The proof follows by induction. ■

Remark 1 (On good indices for stochastically degraded channels). In Corollary 1, the existence of a monotonic
sequence of information sets is proved for a degraded set of channels. A careful inspection of the proof shows
that the choice of the monotonic sequence of sets can be carried out sequentially. First, the information set of the
worst channel is specified. Then, as is shown in (14), all the indices that are "good" for the worse channel are
also "good" for the better channel. Here "good" is in the sense that the corresponding Bhattacharyya constants
of the split channels (which form upper bounds on the corresponding decoding error probability) can be made
exponentially low as the block length increases. Consequently, all that is left to specify are the rest of the "good"
indices for the better channel (which are "not good" for the worse). The construction then follows sequentially.
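The nesting described in this remark can be observed numerically for a degraded family of binary erasure channels. The sketch below is an illustration under several assumptions: the channels are BECs (a BEC with a larger erasure probability is a degraded version of one with a smaller erasure probability), the exact BEC recursion of [1] is used for the Bhattacharyya parameters, and a fixed threshold stands in for the bound $2^{-n^{\beta}}$ in (11).

```python
def bec_bhattacharyya(eps, i):
    """Bhattacharyya parameters of the n = 2**i split channels of a BEC(eps)."""
    z = [eps]
    for _ in range(i):
        nxt = []
        for v in z:
            nxt.append(2 * v - v * v)  # degraded ("minus") split channel
            nxt.append(v * v)          # upgraded ("plus") split channel
        z = nxt
    return z

def good_indices(eps, i, threshold):
    """1-based indices whose Bhattacharyya parameter is at most `threshold`."""
    return {l + 1 for l, v in enumerate(bec_bhattacharyya(eps, i)) if v <= threshold}

# Hypothetical degraded family, ordered from the best channel to the worst.
eps_list = [0.2, 0.4, 0.6]
sets = [good_indices(e, 8, 1e-3) for e in eps_list]
nested = all(sets[s + 1] <= sets[s] for s in range(len(sets) - 1))
```

Both maps in the recursion are increasing on the unit interval, so a larger erasure probability gives a larger parameter at every index, which is why the thresholded sets come out nested.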

Remark 2. Under the assumptions in Corollary 1, the capacity $C_s$ for each of the channels in $\{P_s\}_{s=1}^{S}$ is achieved
with equiprobable inputs. In cases where the parallel channels are not symmetric, a similar result can be shown
where the capacities are replaced with the mutual information obtained with equiprobable inputs.

D. MDS codes 

In this section some basic properties of MDS codes are provided. For complete details and proofs, the reader is 
referred to [4] and [6]. 

Definition 5. An $(n, k)$ linear block code $\mathcal{C}$ whose minimum distance is $d$ is called a maximum distance separable
(MDS) code if

$$d = n - k + 1. \tag{15}$$

Remark 3. The RHS of (15) is the Singleton bound on the minimum distance of a linear block code.

Example 1 (MDS codes). The $(n, 1)$ repetition code, the $(n, n-1)$ single parity-check (SPC) code, and the whole
space of vectors over a finite field are all MDS codes.

The following properties of MDS codes are of interest in the continuation of this paper: 

Proposition 1 (On the generator matrix of an MDS code). Let $\mathcal{C}$ be an MDS code of dimension $k$. Then, every
$k$ columns of the generator matrix of $\mathcal{C}$ are linearly independent.

Corollary 2. Every k symbols of a codeword in an MDS code of dimension k completely characterize the codeword. 

Let $S > 0$ be an integer and fix an integer $m > 0$ such that $2^m - 1 \ge S$. In the following, we explain
how to construct an MDS code of block length $S$ and a dimension $k \in [S]$. For every $k \in [2^m - 1]$, there exists
a $(2^m - 1, k)$ RS code over the Galois field GF($2^m$). Every RS code is an MDS code El Proposition 4.2]. Two
alternatives are suggested:

1) Shortened (punctured) RS codes: Consider a $(2^m - 1, k)$ RS code over the Galois field GF($2^m$). Deleting $2^m - 1 - S$
columns from the generator matrix of the considered code results in an $(S, k)$ linear block code over the
same alphabet. The resulting code is an $(S, k)$ MDS code over GF($2^m$).

2) Generalized RS (GRS) codes: GRS codes are MDS codes which can be constructed over GF($2^m$) for every
block length $S$ and dimension $k$ (as long as $2^m - 1 \ge S$).

Remark 4 (On the determination of codewords in RS and GRS codes). Our main interest in MDS codes is due
to Corollary 2. This property is even more appealing for the case of RS or GRS codes because the determination
of a codeword in RS or GRS codes is based on a polynomial interpolation over finite fields (see, e.g., [6, p. 151]).
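The MDS property of a punctured RS code can be checked by brute force for small parameters. The sketch below is an illustration under stated assumptions: the field is GF(8) generated by the primitive polynomial $x^3 + x + 1$, the RS code is defined by evaluating message polynomials of degree less than $k$ at powers of a primitive element, and puncturing keeps the first $S$ coordinates. All function names are hypothetical.

```python
from itertools import product

# GF(8) exponent table for the primitive polynomial x^3 + x + 1 (assumed choice).
EXP = [1]
for _ in range(6):
    v = EXP[-1] << 1
    if v & 0b1000:
        v ^= 0b1011
    EXP.append(v)
LOG = {v: e for e, v in enumerate(EXP)}

def gf_mul(a, b):
    """Multiplication in GF(8) via log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 7]

def poly_eval(coeffs, x):
    """Horner evaluation; coeffs[0] is the constant term, addition is XOR."""
    acc = 0
    for c in reversed(coeffs):
        acc = gf_mul(acc, x) ^ c
    return acc

def rs_encode(msg, length):
    """Evaluate the message polynomial at the first `length` powers of alpha."""
    return [poly_eval(msg, EXP[j]) for j in range(length)]

def min_distance(k, length):
    """Minimum Hamming weight over all nonzero codewords (the code is linear)."""
    best = length
    for msg in product(range(8), repeat=k):
        if any(msg):
            weight = sum(1 for s in rs_encode(list(msg), length) if s != 0)
            best = min(best, weight)
    return best
```

For instance, `min_distance(2, 4)` evaluates the distance of a length-4, dimension-2 punctured code, which can be compared against the Singleton bound in (15).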

III. The Proposed Coding Scheme (degraded channels) 

In this section, two parallel polar coding schemes are provided for a set of binary-input, memoryless, degraded
and symmetric parallel channels. First, the simple particular case of $S = 3$ parallel degraded channels is studied in
Section III-A. Next, the intermediate case of an arbitrary number of degraded channels is introduced and studied in
Section III-B.



A. Parallel polar coding for S = 3 degraded channels 

Assume that a parallel coding scheme is applied for communication over a set of 3 parallel channels $P_1$, $P_2$,
and $P_3$, whose capacities are $C_1 \ge C_2 \ge C_3$, respectively. According to Theorem 1, the capacity $C_\Pi$ in this case
satisfies

$$C_\Pi = C_1 + C_2 + C_3.$$

Fix the rates $R_1 \ge R_2 \ge R_3$, satisfying $R_s < C_s$ for all $s \in [3]$, and let

$$R = R_1 + R_2 + R_3.$$

In the following, a parallel polar coding scheme of rate $R$ is described that achieves reliable communications.
Therefore, the proposed scheme achieves the capacity $C_\Pi$ by selecting the rates $R_1$, $R_2$, and $R_3$ to be close,
respectively, to $C_1$, $C_2$, and $C_3$, while satisfying the above condition for the rate triple.

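Let $\{\mathcal{A}_n^{(s)}\}$, $s \in [3]$, be the information set sequences as in Corollary 1. Fix a block length $n$, let

$$k_s \triangleq |\mathcal{A}_n^{(s)}|, \quad s \in [3]$$

and

$$k = k_1 + k_2 + k_3.$$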

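The encoding of the $k$ information bits into three codewords $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{x}_3$ is defined as follows. First, the
information bits are arbitrarily partitioned into three groups of sizes $k_1$, $k_2$ and $k_3$. Next, the encoding of the first
two codewords is performed as follows:

• The $k_1$ information bits used to encode $\mathbf{x}_1$ are (arbitrarily) partitioned into three subsets: $\mathbf{u}_{1,1} \in \mathcal{X}^{k_3}$,
$\mathbf{u}_{1,2} \in \mathcal{X}^{k_2 - k_3}$ and $\mathbf{u}_r \in \mathcal{X}^{k_1 - k_2}$.

• The $k_2$ information bits used to encode $\mathbf{x}_2$ are (arbitrarily) partitioned into two subsets: $\mathbf{u}_{2,1} \in \mathcal{X}^{k_3}$ and
$\mathbf{u}_{2,2} \in \mathcal{X}^{k_2 - k_3}$. In addition, $\mathbf{u}_r$ (used for encoding $\mathbf{x}_1$) is also involved in the encoding of $\mathbf{x}_2$.

• The codewords $\mathbf{x}_1$ and $\mathbf{x}_2$ are defined similarly to the case of $S = 2$ parallel channels. Specifically, in terms
of coset codes:

$$\mathbf{x}_1 = \mathbf{u}_{1,1} G_n\bigl(\mathcal{A}_n^{(3)}\bigr) + \mathbf{u}_{1,2} G_n\bigl(\mathcal{A}_n^{(2)} \setminus \mathcal{A}_n^{(3)}\bigr) + \mathbf{u}_r G_n\bigl(\mathcal{A}_n^{(1)} \setminus \mathcal{A}_n^{(2)}\bigr) + \mathbf{b} G_n\bigl([n] \setminus \mathcal{A}_n^{(1)}\bigr) \tag{16}$$

$$\mathbf{x}_2 = \mathbf{u}_{2,1} G_n\bigl(\mathcal{A}_n^{(3)}\bigr) + \mathbf{u}_{2,2} G_n\bigl(\mathcal{A}_n^{(2)} \setminus \mathcal{A}_n^{(3)}\bigr) + \mathbf{u}_r G_n\bigl(\mathcal{A}_n^{(1)} \setminus \mathcal{A}_n^{(2)}\bigr) + \mathbf{b} G_n\bigl([n] \setminus \mathcal{A}_n^{(1)}\bigr) \tag{17}$$

where $\mathbf{b} \in \mathcal{X}^{n - k_1}$ is a predetermined and fixed vector.

The encoding of the codeword $\mathbf{x}_3$ is based on the remaining $k_3$ information bits, denoted by $\mathbf{u}_3 \in \mathcal{X}^{k_3}$. In
addition, the information bits in $\mathbf{u}_{1,2}$, $\mathbf{u}_{2,2}$ and $\mathbf{u}_r$ are also involved in the encoding of $\mathbf{x}_3$:

$$\mathbf{x}_3 = \mathbf{u}_3 G_n\bigl(\mathcal{A}_n^{(3)}\bigr) + (\mathbf{u}_{1,2} + \mathbf{u}_{2,2}) G_n\bigl(\mathcal{A}_n^{(2)} \setminus \mathcal{A}_n^{(3)}\bigr) + \mathbf{u}_r G_n\bigl(\mathcal{A}_n^{(1)} \setminus \mathcal{A}_n^{(2)}\bigr) + \mathbf{b} G_n\bigl([n] \setminus \mathcal{A}_n^{(1)}\bigr).$$

Note that the repetition approach is also applied to the indices in $[n] \setminus \mathcal{A}_n^{(2)}$. However, a different approach is applied
to the indices in $\mathcal{A}_n^{(2)} \setminus \mathcal{A}_n^{(3)}$. The bits corresponding to these indices are set using a symbol-wise parity-check of
$\mathbf{u}_{1,2}$ and $\mathbf{u}_{2,2}$.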

The order of decoding the information bits for all possible assignments of codewords over a set of three parallel
channels is provided in Table I. The decoding starts with the channel $P_1$ with the maximal capacity $C_1$. Irrespective
of the actual codeword that is transmitted over $P_1$, the bits which correspond to the indices in $\mathcal{A}_n^{(1)}$ are decoded
using the standard polar successive cancellation decoding. The decoded bits depend on the actual codeword which
is transmitted over $P_1$.

Next, the decoding proceeds to process the vector observed at the output of the channel $P_2$,
whose capacity is $C_2$. The decoding of $|\mathcal{A}_n^{(2)}|$ information bits is established in this decoding step. Note that for a
standard successive cancellation decoding procedure, $n - |\mathcal{A}_n^{(2)}|$ predetermined and fixed bits are required for proper
operation. For the case at hand, these bits are not all predetermined and fixed. The vector $\mathbf{b}$ is predetermined, but
the rest depends on the repetition bits $\mathbf{u}_r$. Since the bits $\mathbf{u}_r$ were decoded at the previous decoding stage (based on
the observation vector of $P_1$), they can be treated as if they are predetermined and fixed for the decoding at the
output of $P_2$. Consequently, $|\mathcal{A}_n^{(2)}|$ information bits are decoded (depending on the actual codeword transmitted over the
channel $P_2$).

Finally, the decoding proceeds for the vector received at the output of the channel $P_3$. As in the previous
decoding steps, the polar successive cancellation decoding is applied where the bits corresponding to the split
channels indexed by $[n] \setminus \mathcal{A}_n^{(3)}$ are not all predetermined and fixed (in contrast to the standard single channel
case). Nevertheless, these bits can all be determined using the information bits decoded in the first two steps.
The bits in $\mathbf{b}$ are predetermined and fixed. The repetition bits in $\mathbf{u}_r$ are already available after the decoding of
the information transmitted over $P_1$. The rest can be evaluated by taking a bit-wise exclusive-or (xor) of the bits
decoded in the two previous steps.

As an example, a combination shown in Table I is described explicitly. Consider
the case where the codeword $\mathbf{x}_2$ is transmitted over the channel $P_1$, and the codeword $\mathbf{x}_3$ is transmitted over the
channel $P_2$. At the first decoding step, the vectors $\mathbf{u}_{2,1}$, $\mathbf{u}_{2,2}$ and $\mathbf{u}_r$ are decoded (where the predetermined bits
refer to the vector $\mathbf{b}$). Next, the vectors $\mathbf{u}_3$ and $\mathbf{u}_{1,2} + \mathbf{u}_{2,2}$ are decoded (the predetermined bits for this decoding
stage refer to $\mathbf{b}$ and $\mathbf{u}_r$). After this stage, the information bits $\mathbf{u}_{1,2}$ can be determined by $\mathbf{u}_{2,2} + (\mathbf{u}_{1,2} + \mathbf{u}_{2,2})$.
Moreover, the information bits $\mathbf{u}_{1,2}$ are used for the last decoding stage as predetermined and fixed bits (together
with the vectors $\mathbf{u}_r$ and $\mathbf{b}$). After the last decoding stage the vector $\mathbf{u}_{1,1}$ is decoded, and the decoding of all the
information bits is completed.

         Channel P_1                           Channel P_2                      Channel P_3
Codeword   Decoded information         Codeword   Decoded information    Codeword   Decoded information
x_1        u_{1,1}, u_{1,2}, u_r       x_2        u_{2,1}, u_{2,2}       x_3        u_3
x_1        u_{1,1}, u_{1,2}, u_r       x_3        u_3, u_{1,2}+u_{2,2}   x_2        u_{2,1}
x_2        u_{2,1}, u_{2,2}, u_r       x_1        u_{1,1}, u_{1,2}       x_3        u_3
x_2        u_{2,1}, u_{2,2}, u_r       x_3        u_3, u_{1,2}+u_{2,2}   x_1        u_{1,1}
x_3        u_3, u_{1,2}+u_{2,2}, u_r   x_1        u_{1,1}, u_{1,2}       x_2        u_{2,1}
x_3        u_3, u_{1,2}+u_{2,2}, u_r   x_2        u_{2,1}, u_{2,2}       x_1        u_{1,1}

TABLE I: The order of decoding the information bits for all possible assignments of codewords over a set of three
parallel channels ("Codeword" denotes the transmitted codeword).
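The bookkeeping behind this decoding order can be sketched at the level of bit blocks. The code below is an idealized illustration, not a polar decoder: it assumes that successive cancellation over $P_j$ recovers, without errors, exactly the blocks placed on the index set of the $j$-th channel once all other blocks are known, and it checks that every one of the six assignments recovers all information bits. The block sizes and all names are hypothetical.

```python
import random
from itertools import permutations

def xor(a, b):
    """Bit-wise exclusive-or of two equal-length bit lists."""
    return [x ^ y for x, y in zip(a, b)]

def encode_layers(u11, u12, ur, u21, u22, u3):
    """Blocks carried by each codeword on the three index layers.

    "A3": indices in A^(3); "A2": indices in A^(2) minus A^(3);
    "A1": indices in A^(1) minus A^(2).
    """
    return {
        1: {"A3": u11, "A2": u12, "A1": ur},
        2: {"A3": u21, "A2": u22, "A1": ur},
        3: {"A3": u3, "A2": xor(u12, u22), "A1": ur},
    }

def decode(layers, assignment):
    """assignment = (codeword on P1, codeword on P2, codeword on P3)."""
    c1, c2, c3 = assignment
    # Stage 1 (P1): all three layers of the codeword on P1 are decoded.
    got = {c1: dict(layers[c1])}
    ur = got[c1]["A1"]
    # Stage 2 (P2): layer A1 is known (repetition bits); A3 and A2 are decoded.
    got[c2] = {"A3": layers[c2]["A3"], "A2": layers[c2]["A2"], "A1": ur}
    # Stage 3 (P3): layer A2 is the XOR of the two A2 blocks already decoded.
    a2 = xor(got[c1]["A2"], got[c2]["A2"])
    got[c3] = {"A3": layers[c3]["A3"], "A2": a2, "A1": ur}
    return got

def rand_bits(count):
    return [random.randint(0, 1) for _ in range(count)]

random.seed(0)
layers = encode_layers(rand_bits(2), rand_bits(3), rand_bits(4),
                       rand_bits(2), rand_bits(3), rand_bits(2))
results = [decode(layers, p) == layers for p in permutations((1, 2, 3))]
```

The third stage works for every assignment because the three blocks on the middle layer are $\mathbf{u}_{1,2}$, $\mathbf{u}_{2,2}$ and their sum, so any two of them determine the third.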



B. Parallel polar coding for $S > 3$ degraded channels

B.1. Encoding

A parallel polar encoding is described for the general case. The technique used for rate-matching encoding in [13]
is incorporated in the current case as well. This technique is based on MDS codes; in particular, (punctured) RS
codes are used in [13] for rate splitting. As commented in Section II-D, GRS codes can also fit the provided
construction. A set of $S - 1$ MDS codes over the Galois field GF($2^m$), all with a common block length $S$, are
chosen (either by puncturing an appropriate RS code or using GRS codes). These codes are denoted by $\mathcal{C}_{\text{MDS}}^{(k)}$,
$k \in [S - 1]$, where the code $\mathcal{C}_{\text{MDS}}^{(k)}$ has dimension $k$.

Let $\{P_s\}_{s=1}^{S}$ be a given set of memoryless degraded and symmetric parallel channels, whose capacities are
ordered such that $C_1 \ge C_2 \ge \cdots \ge C_S$. Let $\{\mathcal{A}_n^{(s)}\}_{s=1}^{S}$ be the information index sets satisfying the properties in
Corollary 1 for a block length $n$ and rates $R_1 \ge R_2 \ge \cdots \ge R_S$, $R_s < C_s$, $s \in [S]$. Define

$$k_s \triangleq |\mathcal{A}_n^{(s)}|, \quad s \in [S]$$

and

$$k_{S+1} \triangleq 0.$$

In addition, it is assumed for simplicity that $n$ and $k_s$, for all $s \in [S]$, are integral multiples of $m$. In
the provided coding scheme, $k = \sum_{s=1}^{S} k_s$ information bits are encoded into $S$ codewords $\mathbf{x}_s$, $s \in [S]$. As the rates
$R_s$, $s \in [S]$, can be chosen arbitrarily close to $C_s$, respectively, the capacity $C_\Pi$ in (4) is shown to be asymptotically
achievable (the error performance is considered in Section III-B).

Prior to the stage of polar encoding, the $k$ information bits are first mapped into a set of binary vectors

$$\mathcal{U} \triangleq \bigl\{ \mathbf{u}_{s,l} \in \mathcal{X}^{k_{S-l+1} - k_{S-l+2}} : \ s, l \in [S] \bigr\}.$$



The $S \cdot k_S$ bits in the vectors $\mathbf{u}_{s,1}$, $s \in [S]$, are plain information bits, chosen arbitrarily from the set of $k$ information
bits. The vector set

$$\mathcal{C}_2 = \bigl\{ \mathbf{u}_{s,2} = (u_{s,2}(1), u_{s,2}(2), \ldots, u_{s,2}(k_{S-1} - k_S)) : \ s \in [S-1] \bigr\}$$

is also filled with plain information bits, chosen arbitrarily from the set of remaining $k - S \cdot k_S$ information bits
(note that under the above assumptions $k - S \cdot k_S \ge 0$). Next, the vector $\mathbf{u}_{S,2}$ is determined (the following steps
are accompanied by the illustration in Figure 2):

1) Each vector in $\mathcal{C}_2$ is rewritten as a row vector of a matrix over GF($2^m$) (this step is illustrated in Figure 2,
where each vector is represented with a horizontal rectangle). Each $m$ consecutive bits are mapped into a
symbol over GF($2^m$). This results in the $(S-1) \times K_{S-1,S}$ matrix over GF($2^m$)

$$C^{(2)} = \bigl(c_{i,j}^{(2)}\bigr), \quad i \in [S-1], \ j \in [K_{S-1,S}]$$

where

$$K_{S-1,S} \triangleq \frac{k_{S-1} - k_S}{m}.$$

The element $c_{i,j}^{(2)}$ is the symbol over GF($2^m$) corresponding to the binary length-$m$ vector

$$\bigl(u_{i,2}((j-1)m + 1), \ u_{i,2}((j-1)m + 2), \ \ldots, \ u_{i,2}(jm)\bigr)$$

where $i \in [S-1]$ and $j \in [K_{S-1,S}]$.

2) Each one of the columns of $C^{(2)}$ is considered as the first $S - 1$ symbols of a codeword in the code
$\mathcal{C}_{\text{MDS}}^{(S-1)}$. These columns are illustrated with dashed vertical rectangles in Figure 2. Consequently, these columns
completely determine the codewords

$$\{\mathbf{c}_j : \ j \in [K_{S-1,S}]\}$$

in the MDS $[S, S-1]$ code $\mathcal{C}_{\text{MDS}}^{(S-1)}$.

3) A length-$K_{S-1,S}$ vector $\tilde{\mathbf{u}}_{S,2}$ over GF($2^m$) is defined using the last symbol of each of the codewords $\mathbf{c}_j$,
$j \in [K_{S-1,S}]$, evaluated in the last step. Each of these symbols is illustrated as a filled black square in
Figure 2.

4) The vector $\mathbf{u}_{S,2}$ is defined by the binary representation of the vector $\tilde{\mathbf{u}}_{S,2}$, where each symbol over GF($2^m$)
is replaced by its corresponding binary length-$m$ vector.
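Steps 1) to 4) can be sketched for one concrete choice of the $[S, S-1]$ MDS code. The sketch assumes the single parity-check code over GF($2^m$) (an MDS code by Example 1), whose last symbol is the field sum, i.e. the XOR, of the first $S - 1$ symbols; the paper itself allows punctured RS or GRS codes here, so this choice and the function names are illustrative assumptions.

```python
def bits_to_symbols(bits, m):
    """Map each m consecutive bits to one symbol of GF(2^m), stored as an int."""
    return [int("".join(str(b) for b in bits[j:j + m]), 2)
            for j in range(0, len(bits), m)]

def symbols_to_bits(symbols, m):
    """Replace each symbol by its binary length-m representation."""
    return [int(b) for s in symbols for b in format(s, "0{}b".format(m))]

def parity_vector(info_vectors, m):
    """Compute u_{S,2} from u_{1,2}, ..., u_{S-1,2}.

    Assumes the [S, S-1] MDS code is a single parity-check code over GF(2^m).
    """
    rows = [bits_to_symbols(u, m) for u in info_vectors]  # rows of the matrix C^(2)
    parity = [0] * len(rows[0])
    for row in rows:
        # Column-wise sum in GF(2^m); addition in a field of characteristic 2 is XOR.
        parity = [p ^ c for p, c in zip(parity, row)]
    return symbols_to_bits(parity, m)

# Hypothetical example with S = 3 and m = 2.
u_S2 = parity_vector([[1, 0, 1, 1], [0, 1, 1, 0]], 2)
```

With this choice and $S = 3$, the result coincides with the bit-wise sum $\mathbf{u}_{1,2} + \mathbf{u}_{2,2}$ used for $\mathbf{x}_3$ in Section III-A.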
Fig. 2: Illustration of the construction of the vector $\mathbf{u}_{S,2}$. The vectors $\mathbf{u}_{k,2}$, $k \in [S-1]$, defining the matrix $C^{(2)}$ are shown, along with the
columns defining the codewords $\mathbf{c}_j$, $j \in [K_{S-1,S}]$, in $\mathcal{C}_{\text{MDS}}^{(S-1)}$.



The definition of the remaining vectors in $\mathcal{U}$ continues in a similar way. Let $2 < l \le S$, and assume that the
vectors $\mathbf{u}_{s,l'}$ are already defined for all $s \in [S]$ and $l' < l$, based on

$$\sum_{s=1}^{l-1} (S - (s-1)) \bigl(k_{S-(s-1)} - k_{S-(s-2)}\bigr)$$

information bits (from a total of $k$ information bits). The construction phase for the vectors $\mathbf{u}_{s,l}$, $s \in [S]$, is defined
as follows:
as follows: 



1) The binary vector set

$$\mathcal{C}_l = \{\mathbf{u}_{s,l} : \ 1 \le s \le S - (l-1)\}$$

is filled with

$$(S - (l-1)) \bigl(k_{S-(l-1)} - k_{S-(l-2)}\bigr)$$

arbitrarily chosen information bits, out of the remaining

$$k - \sum_{s=1}^{l-1} (S - (s-1)) \bigl(k_{S-(s-1)} - k_{S-(s-2)}\bigr)$$

information bits.

2) Each vector in $\mathcal{C}_l$ is rewritten over GF($2^m$) as a row vector in an $(S - (l-1)) \times K_{S-(l-1),S-(l-2)}$ matrix
over GF($2^m$)

$$C^{(l)} = \bigl(c_{i,j}^{(l)}\bigr)$$

where

$$K_{S-(l-1),S-(l-2)} \triangleq \frac{k_{S-(l-1)} - k_{S-(l-2)}}{m}$$

and $c_{i,j}^{(l)}$, $i \in [S - (l-1)]$, $j \in [K_{S-(l-1),S-(l-2)}]$, equals the symbol in GF($2^m$) corresponding to the binary
length-$m$ vector

$$\bigl(u_{i,l}((j-1)m + 1), \ u_{i,l}((j-1)m + 2), \ \ldots, \ u_{i,l}(jm)\bigr).$$

3) Each column in $C^{(l)}$ is a vector of $S - (l-1)$ symbols over GF($2^m$). The columns of $C^{(l)}$ are considered as the
first $S - (l-1)$ symbols of a codeword in the code $\mathcal{C}_{\text{MDS}}^{(S-(l-1))}$. Hence, each column completely determines a
codeword $\mathbf{c}_j = (c_{j,1}, c_{j,2}, \ldots, c_{j,S})$, $j \in [K_{S-(l-1),S-(l-2)}]$, in the MDS $[S, S - (l-1)]$ code $\mathcal{C}_{\text{MDS}}^{(S-(l-1))}$.

4) Evaluate the remaining symbols for each of the codewords $\mathbf{c}_j$, $j \in [K_{S-(l-1),S-(l-2)}]$.

5) The length-$K_{S-(l-1),S-(l-2)}$ vectors $\tilde{\mathbf{u}}_{s,l} = \bigl(\tilde{u}_{s,l}(1), \ldots, \tilde{u}_{s,l}(K_{S-(l-1),S-(l-2)})\bigr)$, $s > S - (l-1)$, over
GF($2^m$) are defined using the codewords $\mathbf{c}_j$, $j \in [K_{S-(l-1),S-(l-2)}]$, according to

$$\tilde{u}_{s,l}(j) \triangleq c_{j,s}.$$

6) For every $s > S - (l-1)$, the vector $\mathbf{u}_{s,l}$ is defined to be the binary representation of the vector $\tilde{\mathbf{u}}_{s,l}$ (where
each symbol over GF($2^m$) is replaced with its binary length-$m$ vector representation).

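The parallel polar codewords are defined using the coset code notation. Specifically, the codewords $\mathbf{x}_s$, $s \in [S]$,
are defined according to

$$\mathbf{x}_s = \sum_{l=1}^{S} \mathbf{u}_{s,l} G_n\bigl(\mathcal{A}_n^{(S-l+1)} \setminus \mathcal{A}_n^{(S-l+2)}\bigr) + \mathbf{b} G_n\bigl([n] \setminus \mathcal{A}_n^{(1)}\bigr), \quad s \in [S] \tag{18}$$

where $\mathcal{A}_n^{(S+1)} \triangleq \emptyset$ and $\mathbf{b} \in \mathcal{X}^{n - k_1}$ is a binary predetermined and fixed vector.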



B.2. Decoding 

The decoding process starts with the observations received at the output of the channel $P_1$, whose capacity is
maximal. Assume that the codeword $\mathbf{x}_{\pi^{-1}(1)}$ is transmitted over $P_1$. Polar successive cancellation decoding, with
respect to the information index set $\mathcal{A}_n^{(1)}$, is applied to the received vector. This allows the decoding of the vectors
$\mathbf{u}_{\pi^{-1}(1),l}$, $l \in [S]$ (as if they were the information bits of the considered polar code). If $\pi^{-1}(1) = 1$, then indeed all
the vectors $\mathbf{u}_{\pi^{-1}(1),l} = \mathbf{u}_{1,l}$, $l \in [S]$, are information bit vectors. In general, only a subset of these vectors consists
of information bits; the rest are binary representations of coded symbols of the chosen MDS codes.

At the second stage, the vector received over $P_2$, the channel with the second largest capacity, is decoded.
Assume that the codeword $\mathbf{x}_{\pi^{-1}(2)}$ is transmitted over $P_2$.
Polar successive cancellation decoding is used. This decoding procedure is capable of decoding $|\mathcal{A}_n^{(2)}|$ bits based
on $n - |\mathcal{A}_n^{(2)}|$ predetermined and fixed bits. For the current decoding procedure, $n - |\mathcal{A}_n^{(1)}|$ of these bits are the
predetermined and fixed bits in $\mathbf{b}$. The remaining $|\mathcal{A}_n^{(1)}| - |\mathcal{A}_n^{(2)}|$ bits are based on the bits decoded at the previous
decoding stage. Specifically, the bit vector $\mathbf{u}_{\pi^{-1}(2),S}$ can be evaluated using the bit vector $\mathbf{u}_{\pi^{-1}(1),S}$. Recall that
$\mathbf{u}_{\pi^{-1}(2),S}$ is the binary representation of the symbol vector $\tilde{\mathbf{u}}_{\pi^{-1}(2),S}$ over GF($2^m$). Moreover, each of the symbols of $\tilde{\mathbf{u}}_{\pi^{-1}(2),S}$ belongs to a codeword
in the $[S, 1]$ MDS code $\mathcal{C}_{\mathrm{MDS}}^{(1)}$. These codewords are fully determined from the vector $\mathbf{u}_{\pi^{-1}(1),S}$ as follows:

1) Rewrite the vector $\mathbf{u}_{\pi^{-1}(1),S}$ over GF($2^m$), where each block of $m$ consecutive bits is replaced by the corresponding
symbol over GF($2^m$). Denote by

$$\tilde{\mathbf{u}}_{\pi^{-1}(1),S} = \bigl(\tilde{u}_{\pi^{-1}(1),S}(1), \ldots, \tilde{u}_{\pi^{-1}(1),S}(K_{1,2})\bigr)$$

the resulting length-$K_{1,2}$ vector over GF($2^m$).

2) For each symbol $\tilde{u}_{\pi^{-1}(1),S}(j)$, $j \in [K_{1,2}]$, find the codeword

$$\mathbf{c}_j = (c_{j,1}, \ldots, c_{j,S}) \in \mathcal{C}_{\mathrm{MDS}}^{(1)}$$

whose $\pi^{-1}(1)$-th symbol satisfies $c_{j,\pi^{-1}(1)} = \tilde{u}_{\pi^{-1}(1),S}(j)$. These codewords are fully determined by the
considered symbols.

3) Define the vector

$$\tilde{\mathbf{u}}_{\pi^{-1}(2),S} = \bigl(\tilde{u}_{\pi^{-1}(2),S}(1), \ldots, \tilde{u}_{\pi^{-1}(2),S}(K_{1,2})\bigr)$$

according to $\tilde{u}_{\pi^{-1}(2),S}(j) = c_{j,\pi^{-1}(2)}$ for every $j \in [K_{1,2}]$.

4) The vector $\mathbf{u}_{\pi^{-1}(2),S}$ is set to the binary representation of $\tilde{\mathbf{u}}_{\pi^{-1}(2),S}$. That is, the bits $u_{\pi^{-1}(2),S}((j-1)m+1), \ldots, u_{\pi^{-1}(2),S}(jm)$
are the binary representation of the symbol $\tilde{u}_{\pi^{-1}(2),S}(j) \in$ GF($2^m$), $j \in [K_{1,2}]$.

With both $\mathbf{b}$ and $\mathbf{u}_{\pi^{-1}(2),S}$ as predetermined and fixed bits, the polar successive cancellation decoding can
be applied. Consequently, after the second decoding stage, all the $S$ binary vectors $\mathbf{u}_{\pi^{-1}(2),s}$, $s \in [S]$, are fully
determined. Moreover, based on the codewords $\mathbf{c}_j$, $j \in [K_{1,2}]$, the vectors $\mathbf{u}_{\pi^{-1}(s),S}$ are fully determined for all
$s > 2$ as well.

Next, the remaining $S - 2$ decoding stages are described. It is assumed that after the $(s-1)$-th decoding stage,
where $2 < s \le S$, the vectors $\mathbf{u}_{\pi^{-1}(s'),l}$ for either $1 \le s' < s$ and $l \in [S]$, or $s' \ge s$ and $S - s + 3 \le l \le S$, were
decoded at previous stages. At the $s$-th stage, the decoding is extended to the vectors $\mathbf{u}_{\pi^{-1}(s),l}$ for all $l \in [S]$ and
the vectors $\mathbf{u}_{\pi^{-1}(s'),S-s+2}$ for all $s' \in [S]$.

In order to apply the polar successive cancellation decoding procedure to the vector received over the channel
$P_s$, the bits in $\mathbf{b}$ and $\{\mathbf{u}_{\pi^{-1}(s),l}\}_{l \ge S-(s-2)}$ must be known. The vector $\mathbf{b}$ is clearly known. In
addition, the bits in $\{\mathbf{u}_{\pi^{-1}(s),l}\}_{l \ge S-(s-3)}$ are already decoded in previous stages. It is left to determine the bits in
$\mathbf{u}_{\pi^{-1}(s),S-(s-2)}$. These bits are determined in a similar manner as in the decoding stage for $s = 2$, where the vector
$\mathbf{u}_{\pi^{-1}(2),S}$ is determined. Moreover, the determination of $\mathbf{u}_{\pi^{-1}(s),S-(s-2)}$ is established along with the determination
of $\mathbf{u}_{\pi^{-1}(s'),S-(s-2)}$ for all $s' > s$, in the following way:

1) The binary vectors $\mathbf{u}_{\pi^{-1}(s'),S-s+2}$ for $s' < s$ are already decoded at previous stages. Rewrite these vectors
over GF($2^m$), where each block of $m$ consecutive bits is replaced by the corresponding symbol over GF($2^m$). Denote
the set of resulting vectors by

$$\mathcal{V} = \bigl\{\tilde{\mathbf{u}}_{\pi^{-1}(s'),S-s+2} = \bigl(\tilde{u}_{\pi^{-1}(s'),S-s+2}(1), \ldots, \tilde{u}_{\pi^{-1}(s'),S-s+2}(K_{s-1,s})\bigr) :\ s' < s\bigr\}.$$

2) The set $\mathcal{V}$ completely describes $K_{s-1,s}$ codewords $\mathbf{c}_j = (c_{j,1}, \ldots, c_{j,S})$, $j \in [K_{s-1,s}]$, all in the code $\mathcal{C}_{\mathrm{MDS}}^{(s-1)}$
and satisfying the constraints

$$c_{j,\pi^{-1}(s')} = \tilde{u}_{\pi^{-1}(s'),S-s+2}(j), \quad 1 \le s' < s. \tag{19}$$

3) Define the vectors

$$\tilde{\mathbf{u}}_{\pi^{-1}(s'),S-s+2} = \bigl(\tilde{u}_{\pi^{-1}(s'),S-s+2}(1), \ldots, \tilde{u}_{\pi^{-1}(s'),S-s+2}(K_{s-1,s})\bigr)$$

for all $s' \ge s$ by

$$\tilde{u}_{\pi^{-1}(s'),S-s+2}(j) = c_{j,\pi^{-1}(s')}, \quad j \in [K_{s-1,s}].$$

4) The vectors $\mathbf{u}_{\pi^{-1}(s'),S-s+2}$ are determined for all $s' \ge s$ by the binary representation of $\tilde{\mathbf{u}}_{\pi^{-1}(s'),S-s+2}$.

Based on successive cancellation at the current decoding stage, the $k_s$ bits corresponding to the information set
$\mathcal{A}_n^{(s)}$ are decoded. This completes the decoding of all the binary vectors $\mathbf{u}_{\pi^{-1}(s),l}$ for $l \in [S]$.

Remark 5 (On channels with equal capacities). The case where, for an index $s' \in [S]$, $C_{s'} = C_{s'+1}$ is treated by
skipping the construction of Cg' ■ The coset codewords are defined by 

s'-l 
1=1 

l=s'+2 



At the decoding stage, two consecutive polar successive cancellation decoding operations can be performed for the two vectors
received at the outputs of the channels $P_{s'}$ and $P_{s'+1}$.

B.3. Capacity-approaching property 

Theorem 2. The provided parallel coding scheme achieves the capacity of every arbitrarily-permuted memoryless 
degraded and symmetric set of parallel channels. 

Proof: Consider a set of $S$ arbitrarily-permuted degraded memoryless parallel channels $P_s$, $s \in [S]$, whose
capacities are $C_s$, $s \in [S]$, respectively, and assume that the channels are ordered so that

$$C_1 \ge C_2 \ge \cdots \ge C_S.$$

According to Theorem 1, the capacity $C_\Pi$ for the considered model is equal to the sum in (4). For a rate $R < C_\Pi$,
choose a rate set $\{R_s\}_{s=1}^{S}$ satisfying

$$R_s < C_s, \qquad \sum_{s=1}^{S} R_s \ge R. \tag{20}$$

The parallel polar coding in Section III-B is considered. The rate of the proposed scheme is given by

$$\frac{1}{n} \sum_{s=1}^{S} k_s.$$

From (9) and (20), it follows that the proposed scheme can be designed to operate at every rate below capacity. It is
left to prove that the block error probability of the proposed scheme can be made arbitrarily small for a sufficiently
large block length.
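The number of information bits consumed by the construction in Section III-B can be counted directly. The sketch below is only a sanity check under stated assumptions: it uses the convention $k_{S+1} = 0$ (matching $\mathcal{A}_n^{(S+1)} = \emptyset$), illustrative values for the sizes $k_s$, and a hypothetical function name. It confirms that the per-stage counts $(S-(l-1))(k_{S-(l-1)} - k_{S-(l-2)})$, $l \in [S]$, telescope to $\sum_{s} k_s$.

```python
def total_information_bits(k):
    """Sum the per-stage counts (S-(l-1)) * (k_{S-(l-1)} - k_{S-(l-2)}) for l = 1..S.

    k -- list [k_1, ..., k_S] of information-set sizes, non-increasing in s.
    The convention k_{S+1} = 0 is assumed for the first stage (l = 1).
    """
    S = len(k)
    padded = list(k) + [0]           # padded[t - 1] = k_t and padded[S] = k_{S+1} = 0
    total = 0
    for l in range(1, S + 1):
        t = S - (l - 1)              # dimension of the MDS code used at stage l
        total += t * (padded[t - 1] - padded[t])
    return total


k_sizes = [700, 512, 300]            # illustrative values only, not taken from the paper
assert total_information_bits(k_sizes) == sum(k_sizes)
```

Dividing the total by the block length $n$ gives the number of information bits per use of the set of parallel channels.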
Consider the vectors

$$\mathbf{u}_{s,l}, \quad s, l \in [S] \tag{21}$$

in (18). These vectors include all the information bits to be transmitted (in addition to coded versions of these
bits). These vectors are determined either via the successive cancellation decoding procedure of the polar codes,
or by the MDS code structure applied in the parallel scheme. The successive cancellation decoding
procedure is based on detecting the input to the set of split channels $(P_s)_n^{(l)}$, where $s \in [S]$ and $l \in \mathcal{A}_n^{(s)}$. The
information bit corresponding to a split channel $(P_s)_n^{(l)}$ is denoted by $a_{s,l}$. Note that the bit $a_{s,l}$ is either determined
by the successive cancellation decoding procedure for polar codes, or else determined by the codeword of an MDS
code to which it belongs. In cases where the bit $a_{s,l}$ is decoded via a polar successive cancellation decoding
procedure, the decoded bit is denoted by $\hat{a}_{s,l}$.

The bits decoded via the polar successive cancellation decoding procedure, based on the received vector at the output
of the channel $P_s$, $s \in [S]$, are

$$\hat{a}_{s,l}, \quad l \in \mathcal{A}_n^{(s)}. \tag{22}$$

Note that the bits in (22) do not include all the bits in (21). Nevertheless, the rest of the bits in (21) are fully
determined from the decoded bits in (22) based on the MDS code structure (as detailed in the previous section).
Assuming that a permutation $\pi$ is applied to the transmission of codewords, define the events

for all s' <s,l' < 1} 

where $s \in [S]$ and $l \in \mathcal{A}_n^{(s)}$. Since all the information bits can be fully determined from the bits in (22), the
conditional block error probability is given by

where $m$ is the transmitted message (representing the $k$ information bits). The events $\mathcal{E}_l(P_s)$ for $s \in [S]$ and
$l \in \mathcal{A}_n^{(s)}$, defined in (12), can be shown to be independent of the transmitted message [1]. Moreover, it follows that

$$\mathcal{F}_{s,l} \subseteq \mathcal{E}_l(P_s).$$

Consequently, the average block error probability is upper bounded using the union bound according to 

Finally, plugging the upper bound on the error probability (11) into (23) assures that, for every fixed $S > 0$, the
block error probability can be made arbitrarily low as the block length increases. ■

Remark 6 (On symmetry condition for the applied coding scheme). In order to use the result in (11) for the
proof of Theorem 2, we rely on the symmetry result in [1]. Specifically, it is shown in [1] that for symmetric
channels according to Definition 2, the error performance of the polar coding successive cancellation process is
independent of both the information bits and the predetermined and fixed bits. This result is of particular importance
for our scheme, as the predetermined and fixed bits of the channel polarization method are not predetermined and
fixed in our scheme.

IV. Parallel Polar Coding for Non-Degraded Parallel Channels 

In this section, a parallel polar coding scheme is provided for transmissions over non-degraded parallel channels.
With the introduction of non-degraded channels, the property which must be relaxed is the monotonicity of the
information sets in (10). Consequently, a proper modification must be introduced. In fact, it is the ordering of the
successive cancellation process which is found to be the key ingredient in dealing with the non-degraded case. That
is, the decoding is not carried out channel-after-channel as in Section III; instead, for each bit index a different ordering of
channels is applied. If the decoding order is kept channel-after-channel, it can be shown that the decoding method
presented in Section III cannot achieve capacity. In particular, an upper bound on the rate achievable by the coding method
in Section III is first provided. Next, two alternative coding schemes with modified ordering are presented.

A. Upper bound for channel-after-channel ordering

A.l. Signaling over Parallel Erasure Channels 

The following proposition, provided in [12], considers the Bhattacharyya parameters of the split channels:

Proposition 2 (On the worst Bhattacharyya parameter [12]). Let $p$ be a binary-input memoryless output-symmetric
channel, and consider the split channel $p_n^{(l)}$ where $l \in [n]$. Then, among all such binary-input memoryless
output-symmetric channels $p$ whose Bhattacharyya parameter equals $B$, the binary erasure channel has the maximal
Bhattacharyya parameter $B(p_n^{(l)})$, for every $l \in [n]$.

The proof of Proposition 2 is based on a tree-channel characterization of split channels, in addition to an argument
which is related to extremes of information combining. Based on Proposition 2, a polar signaling scheme is provided
in [12] for reliable communication in a compound setting. A similar technique is used in the following for the
parallel channel setting.

Consider the parallel transmission model in Section II-A. In this section, it is assumed that the parallel channels
are binary-input memoryless and symmetric, but are not necessarily degraded. We further assume, without loss of
generality, that the set of parallel channels $\{P_s\}_{s\in[S]}$ are ordered such that

$$B(P_1) \le B(P_2) \le \cdots \le B(P_S)$$

where $B(P_s)$ is the Bhattacharyya parameter of the channel $P_s$, $s \in [S]$ (note that the Bhattacharyya parameter
varies from 0 to 1, with the extremes of zero and one for noiseless and completely noisy channels, respectively).
Next, consider the set of parallel binary erasure channels $\{\delta_s\}_{s\in[S]}$, where the erasure probability of the channel $\delta_s$
equals $B(P_s)$, $s \in [S]$. These erasure channels form a family of $S$ stochastically degraded channels. Consequently,
based on Theorem 2, the parallel polar coding scheme in Section III-B achieves a rate of $S - \sum_{s=1}^{S} B(P_s)$ over the
set of erasure channels, under the successive cancellation decoding scheme detailed in Section III-B. The following
corollary addresses the performance of the same coding scheme over the original set of parallel channels:

Corollary 3. The polar coding scheme for the parallel erasure channels operates reliably over the original parallel
channels.

Proof: The suggested coding scheme performs reliably over the parallel binary erasure channels. The decoding
process, as described in Section III-B, includes a sequence of successive cancellation decoding operations applied to
the polar codes over each one of the parallel channels. As shown in the proof of Theorem 2, reliable communication
is obtained based on reliably decoding each of the successive cancellation operations. It is therefore required to
show that the successive cancellation over the original channels $\{P_s\}_{s\in[S]}$ can also be carried out reliably; this follows
as a consequence of Proposition 2. Denote the sequences of information sets chosen for reliable communication
over the erasure channels $\{\delta_s\}_{s\in[S]}$ by $\{\mathcal{A}_n^{(s)}\}_{s\in[S]}$. Fix an arbitrary channel $P_s$ from the set of parallel channels,
and an arbitrary index $l \in \mathcal{A}_n^{(s)}$. Consider next the error event $\mathcal{E}_l(P_s)$ in (12). According to [1], the probability of this error event is
upper bounded by

$$\Pr\bigl(\mathcal{E}_l(P_s)\bigr) \le B\bigl((P_s)_n^{(l)}\bigr) \tag{24}$$

where $B\bigl((P_s)_n^{(l)}\bigr)$ denotes the Bhattacharyya parameter of the split channel $(P_s)_n^{(l)}$. From Proposition 2, it follows
that

$$B\bigl((P_s)_n^{(l)}\bigr) \le B\bigl((\delta_s)_n^{(l)}\bigr) \tag{25}$$

where $B\bigl((\delta_s)_n^{(l)}\bigr)$ is the Bhattacharyya parameter of the split channel $(\delta_s)_n^{(l)}$. Fix $0 < \beta < \frac{1}{2}$ as in [2]. From (24)
and (25), it follows from [2] that

$$\Pr\bigl(\mathcal{E}_l(P_s)\bigr) \le 2^{-n^{\beta}}.$$

Consequently, the successive cancellation decoding operations can be carried out reliably for each one of the original
channels, which completes the proof. ■
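The rate $S - \sum_{s} B(P_s)$ guaranteed through the surrogate erasure channels is straightforward to evaluate. The following Python sketch does so for an illustrative pair of channels, a BEC and a BSC; the closed forms $B = \epsilon$ for a BEC with erasure probability $\epsilon$ and $B = 2\sqrt{p(1-p)}$ for a BSC with crossover probability $p$ are standard, while the function names and the chosen channel pair are assumptions made for illustration only.

```python
import math


def bhattacharyya_bsc(p):
    """Bhattacharyya parameter of a BSC with crossover probability p."""
    return 2.0 * math.sqrt(p * (1.0 - p))


def bhattacharyya_bec(eps):
    """Bhattacharyya parameter of a BEC with erasure probability eps."""
    return eps


def erasure_surrogate_rate(bhattacharyya_params):
    """Rate S - sum_s B(P_s) achieved over the surrogate erasure channels."""
    return len(bhattacharyya_params) - sum(bhattacharyya_params)


# Illustrative pair, ordered by increasing Bhattacharyya parameter.
params = [bhattacharyya_bec(0.5), bhattacharyya_bsc(0.11002)]
rate = erasure_surrogate_rate(params)
print(f"guaranteed rate: {rate:.3f} bits per use of the two parallel channels")
```

For this pair the guaranteed rate falls noticeably below the sum of the two channel capacities, which is the price of replacing every channel by the erasure channel with the same Bhattacharyya parameter.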

A.2. A Compound Interpretation of Monotone Index Set Design and Related Results 

The parallel coding scheme provided in Section III is based on a monotonic sequence of index sets $\{\mathcal{A}_n^{(s)}\}_{s\in[S]}$
satisfying the conditions in Corollary 1. As explained in Remark 1, the indices in $\mathcal{A}_n^{(s)}$, $s \in [S]$, are 'good' for
all the channels $P_{s'}$, $s' > s$. Here, as in Remark 1, 'good' means that the Bhattacharyya parameters
of the corresponding split channels satisfy the polarization properties studied in [1], [2]. The index set sequences
$\{\mathcal{A}_n^{(s)}\}_{s\in[S]}$ are applied in this paper to parallel transmission. Even though the compound setting and the problem
of parallel transmissions are, at first glance, different, the actual problem of finding an index set which is 'good'
for a set of channels is similar to the problem studied in [12] in the compound model.

In the compound setting, the transmission takes place over one channel which belongs to a predetermined 
set of channels. It is assumed in the current discussion that (only) the receiver knows the channel over which the 
transmission takes place. If a polar code is applied in such a compound setting, then a suitable index set is required. 
Such an index set must be 'good' for all the channels in the set. The maximal rate over which such a polar coding 
scheme performs reliably is termed as the compound capacity of polar codes. Obviously, the compound capacity 
relates to the size of possible 'good' index sets. 
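For binary erasure channels the requirement of an index set that is 'good' for every channel in a set can be checked numerically, since the Bhattacharyya parameters of the split channels follow the exact recursion $Z \mapsto (2Z - Z^2,\ Z^2)$ of [1]. The sketch below is a minimal illustration, not the construction of [12]: the threshold used to declare an index 'good', the ordering of the indices, and all function names are assumptions. It shows that for a family of erasure channels the common 'good' set is the set of the worst channel.

```python
def bec_split_erasure_probs(eps, levels):
    """Erasure probabilities of the n = 2**levels split channels of a BEC(eps).

    Uses the erasure-channel recursion Z -> (2Z - Z**2, Z**2); for a BEC the
    Bhattacharyya parameter equals the erasure probability. The ordering of
    the returned list is one fixed indexing convention (an assumption here).
    """
    z = [eps]
    for _ in range(levels):
        z = [child for parent in z
             for child in (2 * parent - parent * parent, parent * parent)]
    return z


def good_indices(eps, levels, threshold):
    """Indices whose split-channel Bhattacharyya parameter is at most threshold."""
    return {i for i, value in enumerate(bec_split_erasure_probs(eps, levels))
            if value <= threshold}


def compound_good_indices(eps_list, levels, threshold):
    """Indices that are 'good' simultaneously for every BEC in the set."""
    sets = [good_indices(eps, levels, threshold) for eps in eps_list]
    return set.intersection(*sets)


common = compound_good_indices([0.3, 0.5], levels=8, threshold=1e-3)
# For erasure channels the common set coincides with the set of the worst channel.
assert common == good_indices(0.5, levels=8, threshold=1e-3)
```

For channels that are not degraded with respect to each other, such an inclusion need not hold, which is why the compound setting may incur a rate loss.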

Upper and lower bounds on the compound capacity of polar codes under successive cancellation decoding are
provided in [12]. These bounds are defined using the notion of tree-channels. Let $p$ be a binary-input memoryless
output-symmetric channel. For a binary vector of length $k$, $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_k)$, the tree-channel associated with $\sigma$ is
denoted by $p^{\sigma}$. The actual definition of the tree-channel is not required for the following discussion, and is therefore
omitted (the reader is referred to [12] and references therein for more details). It is noted that the tree-channel is
also binary-input memoryless and output-symmetric. Moreover, it is further noted in [12] that the tree-channel $p^{\sigma}$
is equivalent to the split channel $p_n^{(l)}$, where $\sigma$ is the binary expansion of $l$.

Let $\{P_s\}_{s\in[S]}$ be a set of $S$ binary-input memoryless output-symmetric channels. It is shown in [12] that the
compound capacity for the considered setting, $C(\{P_s\}_{s\in[S]})$, is lower bounded by

$$C\bigl(\{P_s\}_{s\in[S]}\bigr) \ge 1 - \frac{1}{2^k} \sum_{\sigma\in\{0,1\}^k} \max_{s\in[S]} B\bigl(P_s^{\sigma}\bigr) \tag{26}$$

where $k \in \mathbb{N}$ and $B(P_s^{\sigma})$ is the Bhattacharyya parameter of the tree-channel $P_s^{\sigma}$. Moreover, this lower bound is
a constructive bound. That is, the construction of an appropriate index set sequence $\mathcal{A}_n(\{P_s\}_{s\in[S]})$ is inherent
in the lower bound. The polar code corresponding to this index set has an asymptotically low decoding error
probability under successive cancellation decoding (for every channel in the set $\{P_s\}_{s\in[S]}$).

Remark 7 (On the derivation of (26)). The actual derivation in [12] is provided for two channels $P$ and $Q$.
Nevertheless, the arguments in [12] are suitable for the case of $S > 2$ channels. The proof of the bounds in [12] is
based on two major arguments. The first argument considers a sequential transformation of a given channel $P$ to
a sequence of sets of tree-channels. Initially, the channel $P$ is transformed into a pair of tree-channels $P^{0}$ and $P^{1}$.
Next, each of these tree-channels is transformed again to another pair, and the transformation repeats recursively.
It is shown that instead of transmitting bits corresponding to indices induced by the polarization of the original
channel $P$, at each transformation level $k$, the problem is equivalent to transmitting a fraction $\frac{1}{2^k}$ of the bits based
on the indices induced by the polarization of the corresponding tree-channels $\{P^{\sigma}\}_{\sigma\in\{0,1\}^k}$. The first argument
is therefore not affected by the number of channels (as it concerns a property of a single channel). The second
argument is identical to the simpler polarization scheme detailed in Section IV-A. This polarization scheme,
based on binary erasure channels, can be applied to every set of tree-channels $\{P_s^{\sigma}\}_{s\in[S]}$, $\sigma \in \{0,1\}^k$. Based on
this polarization scheme, a rate of $\frac{1}{2^k}\bigl(1 - \max_{s\in[S]} B(P_s^{\sigma})\bigr)$ is guaranteed for each $\sigma \in \{0,1\}^k$.

Corollary 4 (Improved parallel polar coding scheme). Consider the transmission over a set of parallel binary-input
memoryless and output-symmetric channels $\{P_s\}_{s\in[S]}$. Fix an order $P_{s_1}, P_{s_2}, \ldots, P_{s_S}$ of the channels and $k \in \mathbb{N}$.
Then, reliable transmission is achievable based on the parallel polar coding scheme in Section III whose rate is
given by

$$C(P_{s_S}) + (S-1) - \frac{1}{2^k} \sum_{s\in[S-1]} \sum_{\sigma\in\{0,1\}^k} \max_{s \le i \le S} B\bigl(P_{s_i}^{\sigma}\bigr). \tag{27}$$

Proof: Define the channel sets

$$\mathcal{V}_s \triangleq \{P_{s_i}\}_{i=s}^{S}, \quad s \in [S].$$

For each channel set $\mathcal{V}_s$, $s \in [S]$, the compound setting is considered. Based on the lower bound in (26) and its
associated index set sequence, a set sequence $\mathcal{A}_n(\mathcal{V}_s)$ exists for every $s \in [S]$, such that

cre{o,i}*^ 
and reliable decoding is guaranteed for all the channels in the set $\mathcal{V}_s$ under successive cancellation decoding. As
an immediate consequence of the construction, for every $n$, the index sets form a monotonic sequence (i.e., if an
index is 'good' for a set of channels, it must be 'good' for a subset of these channels). Therefore, the monotone
set sequence required for the polar construction is provided, and the parallel polar scheme in Section III can be applied.
The rate of the resulting scheme is given by summing over the rates in (28), which adds up to

$$S - \frac{1}{2^k} \sum_{s\in[S]} \sum_{\sigma\in\{0,1\}^k} \max_{s \le i \le S} B\bigl(P_{s_i}^{\sigma}\bigr).$$

Since the last channel set $\mathcal{V}_S$ includes just a single channel $P_{s_S}$, the compound setting is not required for this set.
For the last set, the information index set of the polar coding construction (in Section II-B) is therefore applied.
The resulting rate of the parallel scheme is improved and given by (27). ■

Remark 8 (Possible order of channels). The channel order may be an important parameter for the provided 
parallel scheme (in terms of achievable rates). The channels may be ordered by their capacity, where 

However, we have no evidence that this order results in the maximal achievable rate (or that it is optimal in any 
other sense). 

Remark 9 (An upper bound on parallel polar capacity). For each set $\mathcal{V}_s$, $s \in [S]$, the upper bound in [12]
on the compound capacity can be applied to upper bound the size of the existing index sets $\mathcal{A}_n(\mathcal{V}_s)$. According
to [12, Theorem 5], the resulting rate is upper bounded by

$$\frac{1}{2^k} \sum_{\sigma\in\{0,1\}^k} \min_{s \le i \le S} I\bigl(P_{s_i}^{\sigma}\bigr)$$

for every $k \in \mathbb{N}$, where $I(P_{s_i}^{\sigma})$ is the capacity of the corresponding tree-channel $P_{s_i}^{\sigma}$. As mentioned in Remark 7,
the actual derivation in [12] is provided for two channels $P$ and $Q$. Nevertheless, the arguments in [12] are suitable
for the case of $S > 2$ channels. The proof of the considered upper bound is based on two major arguments. The
first argument is a transformation of a channel to a sequence of sets of tree-channels (the same as in the lower
bound). Then, for each such set, the maximal achievable rate is upper bounded by the minimum of the
channel capacities. Since for the last channel set, which is a set of a single channel, we have no compound setting
(as explained in the proof of Corollary 4), the maximal rate at which the parallel polar coding scheme proposed in
Section III can operate reliably is given by

$$C(P_{s_S}) + \frac{1}{2^k} \sum_{s\in[S-1]} \sum_{\sigma\in\{0,1\}^k} \min_{s \le i \le S} I\bigl(P_{s_i}^{\sigma}\bigr).$$

An example is provided in [12], demonstrating that the concerned bound can be smaller than each of the channel
capacities. Specifically, the example in [12] is based on a BSC with a crossover probability of 0.11002 and a BEC
whose erasure probability is 0.5. Both of these channels correspond to a capacity of 0.5 bits per channel use.
However, as demonstrated in [12, Example 6], their compound capacity is upper bounded by 0.482 bits per channel
use. Consequently, if the parallel polar coding scheme in Section III is applied for the same two channels, the
possible rate of such a parallel coding scheme is upper bounded by 0.982 bits per channel use, whereas the parallel
capacity is given by 1 bit per channel use.
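The numbers in this example can be cross-checked with a few lines of Python. The sketch below only recomputes the two channel capacities from their standard closed forms; the value 0.482 is taken as quoted from [12, Example 6] and is not derived here, and the function and variable names are illustrative.

```python
import math


def binary_entropy(p):
    """Binary entropy function in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(p):
    """Capacity of a BSC with crossover probability p, in bits per channel use."""
    return 1.0 - binary_entropy(p)


def bec_capacity(eps):
    """Capacity of a BEC with erasure probability eps, in bits per channel use."""
    return 1.0 - eps


c_bsc = bsc_capacity(0.11002)
c_bec = bec_capacity(0.5)
compound_upper_bound = 0.482            # quoted from [12, Example 6], not computed here
parallel_capacity = c_bsc + c_bec       # sum of the two capacities
scheme_upper_bound = 0.5 + compound_upper_bound  # single-channel set plus compound set
```

The gap between `parallel_capacity` and `scheme_upper_bound` is the rate loss that motivates the modified decoding orders of the next subsection.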

B. Two capacity achieving schemes for non-degraded channels 

Consider the case where transmission takes place over a set of $S$ binary-input, memoryless, output-symmetric
channels $\{P_s\}_{s=1}^{S}$. Since the channels are no longer degraded, the monotonicity property guaranteed in Corollary 1
no longer applies. Nevertheless, polarization of each one of the channels is still guaranteed. That is, information
set sequences $\mathcal{A}_n^{(s)}$, $s \in [S]$, satisfying the rate and performance properties in (9) and (11) continue to exist. Two
capacity-achieving schemes are provided in this section. The first is based on interleaved binary polar codes, and
the second is based on non-binary polarization.

As in Section III, MDS codes are used in the parallel coding scheme. Fix an integer $m > 0$ such that $2^m - 1 \ge S$.
All MDS codes to be applied in the introduced coding scheme are defined over GF($2^m$). We assume in the following
that such MDS codes of block length $S$ over GF($2^m$) are fixed and known both to the receiver and the transmitter,
for every dimension $d \in [S]$. These MDS codes are denoted by $\mathcal{C}_d$.

Interleaved parallel polar coding scheme 

For the interleaved parallel polar coding scheme, $m$ interleaved polar codes are applied for every channel $P_s$,
$s \in [S]$. The $m$ interleaved polar codes of each channel $P_s$, $s \in [S]$, are defined based on the same information set
sequence $\mathcal{A}_n^{(s)}$. The encoding process is defined as follows:

1) For every information index $k \in \mathcal{A}_n^{(s)}$, and every channel index $s \in [S]$:

a) Pick $m$ information bits, denoted by $u^{(s)}_{(k-1)m+l}$, $1 \le l \le m$.

b) Define a symbol $a_s^{(k)}$ over GF($2^m$), based on the binary length-$m$ vector

$$\bigl(u^{(s)}_{(k-1)m+1}, \ldots, u^{(s)}_{(k-1)m+m}\bigr).$$

c) For every $k \in [n]$, a length-$S$ codeword $\mathbf{c}^{(k)} = (c_1^{(k)}, c_2^{(k)}, \ldots, c_S^{(k)})$ over GF($2^m$) is defined according
to:

i) Set $d = |\{s : k \in \mathcal{A}_n^{(s)}\}|$.

ii) Choose the codeword $\mathbf{c}^{(k)} \in \mathcal{C}_d$ satisfying $c_{s'}^{(k)} = a_{s'}^{(k)}$ for every $s' \in \{s : k \in \mathcal{A}_n^{(s)}\}$. Note that as
$\mathcal{C}_d$ is an MDS code of dimension $d$, the codeword $\mathbf{c}^{(k)}$ is indeed completely determined by the $d$
indices $\{s : k \in \mathcal{A}_n^{(s)}\}$.

2) For every index $k \notin \mathcal{A}_n^{(s)}$ and every $s \in [S]$, define the binary vector

$$\bigl(u^{(s)}_{(k-1)m+1}, u^{(s)}_{(k-1)m+2}, \ldots, u^{(s)}_{(k-1)m+m}\bigr)$$

as the binary vector representation of the symbol $c_s^{(k)}$.

3) Compute the $m \cdot S$ polar codewords $\mathbf{x}_{l,s} \in \{0,1\}^n$, $l \in [m]$, $s \in [S]$, according to

$$\mathbf{x}_{l,s} = \bigl(u^{(s)}_{l}, u^{(s)}_{m+l}, \ldots, u^{(s)}_{(n-1)m+l}\bigr) \cdot G_n$$

where $G_n$ is the polar generator matrix.

4) For every channel index $s \in [S]$, form a codeword $\mathbf{x}^{(s)}$ for transmission based on the concatenation

$$\mathbf{x}^{(s)} = (\mathbf{x}_{1,s}, \mathbf{x}_{2,s}, \ldots, \mathbf{x}_{m,s}).$$

At the decoder, it is assumed that the concatenated codeword $\mathbf{x}^{(\pi(s))}$ is transmitted over the channel $P_s$, $s \in [S]$.
The first stage of the decoding process goes as follows:

1) For every $s \in [S]$ such that $1 \in \mathcal{A}_n^{(s)}$, the bits $u_l^{(\pi(s))}$, $l \in [m]$, can be decoded, based on the first step of the
standard polar coding successive cancellation decoding procedure, for the $m$ interleaved polar codes of the
channel $P_s$.

2) Set $d \triangleq |\{s : 1 \in \mathcal{A}_n^{(s)}\}|$.

3) Find the codeword $\mathbf{c} = (c_1, c_2, \ldots, c_S)$ in $\mathcal{C}_d$ such that for every $s' \in \{s : 1 \in \mathcal{A}_n^{(s)}\}$, the symbol $c_{\pi(s')}$
equals the symbol in GF($2^m$) corresponding to the binary vector

$$\bigl(u_1^{(\pi(s'))}, u_2^{(\pi(s'))}, \ldots, u_m^{(\pi(s'))}\bigr).$$

Note that the codeword $\mathbf{c}$ is completely determined by every $d$ symbols. That is, the decoding result does not
depend on the actual permutation $\pi$ applied during the block transmission.

4) For every $s' \notin \{s : 1 \in \mathcal{A}_n^{(s)}\}$, the bits

$$\bigl(u_1^{(\pi(s'))}, u_2^{(\pi(s'))}, \ldots, u_m^{(\pi(s'))}\bigr)$$

are set to be the length-$m$ binary vector representation of the symbol $c_{\pi(s')} \in$ GF($2^m$).

Note that after the first stage of the decoding process, all the bits $u_l^{(s)}$, $s \in [S]$, $l \in [m]$, are decoded. Next, the
$k$-th stage, $2 \le k \le n$, of the decoding process is described. It is assumed that the bits $u^{(s)}_{m(k'-1)+l}$,
$s \in [S]$, $l \in [m]$, are decoded for all $k' \le k - 1$. The decoding of the bits $u^{(s)}_{(k-1)m+l}$, $s \in [S]$, $l \in [m]$, goes as
follows:

1) For every $s \in [S]$ such that $k \in \mathcal{A}_n^{(s)}$, the bits $u^{(\pi(s))}_{(k-1)m+l}$, $l \in [m]$, can be decoded using $m$ standard polar
coding successive cancellation decoding procedures. These decoding procedures are based on the bits which
were decoded in earlier decoding stages. That is, for a fixed $l \in [m]$, the bit $u^{(\pi(s))}_{(k-1)m+l}$ is decoded based on
the bits $u^{(\pi(s))}_{(k'-1)m+l}$, $k' \in [k-1]$, using the standard polar coding successive cancellation decoding procedure
for the polar code defined based on the index set sequence $\mathcal{A}_n^{(s)}$.

2) Set $d \triangleq |\{s : k \in \mathcal{A}_n^{(s)}\}|$.

3) Find the codeword $\mathbf{c} = (c_1, c_2, \ldots, c_S)$ in $\mathcal{C}_d$ such that for every $s' \in \{s : k \in \mathcal{A}_n^{(s)}\}$, the symbol $c_{\pi(s')}$
equals the symbol in GF($2^m$) corresponding to the length-$m$ binary vector

$$\bigl(u^{(\pi(s'))}_{(k-1)m+1}, u^{(\pi(s'))}_{(k-1)m+2}, \ldots, u^{(\pi(s'))}_{km}\bigr).$$

Note that this codeword is completely determined by every $d$ symbols. That is, the decoding result does not
depend on the actual permutation $\pi$ applied during the block transmission.

4) For every $s' \notin \{s : k \in \mathcal{A}_n^{(s)}\}$, the bits

$$\bigl(u^{(\pi(s'))}_{(k-1)m+1}, u^{(\pi(s'))}_{(k-1)m+2}, \ldots, u^{(\pi(s'))}_{km}\bigr)$$

are set to be the binary vector representation of the symbol $c_{\pi(s')} \in$ GF($2^m$).
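The property that makes the decoder insensitive to the permutation is that an MDS codeword of dimension $d$ is fixed by any $d$ of its symbols. A minimal Python sketch of this completion step is given below, using a Reed-Solomon code evaluated by Lagrange interpolation. It departs from the text in one loudly stated respect: for brevity the arithmetic is over the prime field GF(257) instead of GF($2^m$); the evaluation points $1, \ldots, S$ and the function name are likewise assumptions made for illustration.

```python
P = 257  # prime modulus; a stand-in for GF(2^m), chosen only to keep the arithmetic short


def rs_complete(known, length, p=P):
    """Complete a Reed-Solomon (MDS) codeword from d known symbols.

    known  -- dict {position: symbol} with positions in 1..length; d = len(known)
    length -- block length S (must satisfy S < p)
    The codeword is (f(1), ..., f(S)) for the unique polynomial f of degree
    smaller than d through the known points (Lagrange interpolation mod p).
    """
    points = list(known.items())
    codeword = []
    for x in range(1, length + 1):
        total = 0
        for xi, yi in points:
            num, den = 1, 1
            for xj, _ in points:
                if xj != xi:
                    num = num * (x - xj) % p
                    den = den * (xi - xj) % p
            # pow(den, -1, p) is the modular inverse (Python 3.8 or later).
            total = (total + yi * num * pow(den, -1, p)) % p
        codeword.append(total)
    return codeword


# Encoder side: S = 4 channels, the index is 'good' for channels 1 and 3 (d = 2).
full = rs_complete({1: 17, 3: 200}, length=4)
# Decoder side: any two other positions of the same codeword recover it.
again = rs_complete({2: full[1], 4: full[3]}, length=4)
assert again == full
```

With $d = 1$ the sketch reduces to a repetition code, which corresponds to an index that is 'good' for a single channel only.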

Proposition 3. The provided interleaved parallel polar coding scheme achieves the parallel channel capacity. 

Proof: Since MDS codes of dimension $d$ possess the property that every set of $d$ symbols completely describes a
codeword, the performance of the provided decoding process does not depend on the actual transmission permutation.
The fact that the resulting error probability approaches zero is a direct consequence of the error performance of
the channel polarization method. It remains to show that the coding rate approaches capacity. Note that for every
channel, $m$ interleaved polar codes of block length $n$ are applied. Hence, for a fixed $n$ the transmission rate is
given according to

$$\sum_{s=1}^{S} \frac{m \cdot |\mathcal{A}_n^{(s)}|}{m \cdot n}$$

which, according to the polarization properties in Q, approaches $\sum_{s=1}^{S} C_s$ as $n$ approaches infinity. ■

Parallel polar coding scheme based on non-binary channel polarization 

As an alternative to $m$ interleaved binary polar codes for every channel, a single non-binary polar code can be
applied. Non-binary polar codes are studied in Q and Q. For the particular case where the size of the channel input
alphabet is a power of a prime, an explicit construction is provided in [5] in terms of an $n \times n$ generator polarization
matrix $G_n$ over GF($2^m$). As in the binary polarization method, non-binary polarization generates an information-index
set sequence, for which the corresponding split channels approach perfect channels. These split channels
allow for a corresponding polar successive cancellation decoding process, while keeping the fraction of information
indices arbitrarily close to the channel capacity.

In order to apply the non-binary polarization coding scheme, a new set of parallel channels $\{W_s\}_{s=1}^{S}$ is defined
according to

$$W_s(\mathbf{y}|x) \triangleq \prod_{i=1}^{m} P_s\bigl(y_i \,|\, b_i(x)\bigr)$$

where $\mathbf{y} = (y_1, \ldots, y_m) \in \mathcal{Y}^m$, $x \in$ GF($2^m$), $s \in [S]$, and

$$\mathbf{b}(x) = (b_1(x), \ldots, b_m(x))$$

is the binary length-$m$ vector representation of the symbol $x$. A coding scheme for the parallel channels $W_s$, $s \in [S]$,
is equivalent to a coding scheme for the original binary parallel channels where the transmission of a symbol $x$
over a channel $W_s$ is replaced with $m$ transmissions over the channel $P_s$, $s \in [S]$. With some abuse of notation,
the information index set sequence for each of the non-binary channels $W_s$, $s \in [S]$, is also denoted by $\mathcal{A}_n^{(s)}$. The
encoding for the parallel non-binary polarization scheme follows according to the following steps:

1) For every information index $k \in \mathcal{A}_n^{(s)}$, and every channel index $s \in [S]$:

a) Pick $m$ information bits.

b) Denote by $a_s^{(k)}$ the symbol in GF($2^m$) corresponding to these $m$ information bits.

2) For every $k \in [n]$, a length-$S$ codeword $\mathbf{c}^{(k)} = (c_1^{(k)}, c_2^{(k)}, \ldots, c_S^{(k)})$ over GF($2^m$) is defined according to:

a) Set $d = |\{s : k \in \mathcal{A}_n^{(s)}\}|$.

b) Choose the codeword $\mathbf{c}^{(k)} \in \mathcal{C}_d$ satisfying $c_{s'}^{(k)} = a_{s'}^{(k)}$ for every $s' \in \{s : k \in \mathcal{A}_n^{(s)}\}$.

3) Compute $S$ polar codewords $\mathbf{x}_s$, $s \in [S]$, according to

$$\mathbf{x}_s = \bigl(c_s^{(1)}, c_s^{(2)}, \ldots, c_s^{(n)}\bigr) \cdot G_n \tag{29}$$

where $G_n$ is the polar generator matrix and arithmetic is carried out over GF($2^m$).
The first stage of the decoding process is carried out as follows:

1) For every $s \in [S]$ such that $1 \in \mathcal{A}_n^{(s)}$, the symbols $c^{(1)}_{\pi(s)}$ can be decoded, based on the first step of the polar
coding successive cancellation decoding procedure applied to the corresponding non-binary channels $W_s$.

2) Set $d \triangleq |\{s : 1 \in \mathcal{A}_n^{(s)}\}|$.

3) Find the codeword $\mathbf{c} = (c_1, c_2, \ldots, c_S)$ in $\mathcal{C}_d$ such that for every $s' \in \{s : 1 \in \mathcal{A}_n^{(s)}\}$

$$c^{(1)}_{\pi(s')} = c_{\pi(s')}. \tag{30}$$

4) For every $s' \notin \{s : 1 \in \mathcal{A}_n^{(s)}\}$, decode the symbols $c^{(1)}_{\pi(s')}$ according to (30).

Next, assume that the decoding process is complete up to step $k - 1$, where $2 \le k \le n$. That is, for every $k' \in [k-1]$
the symbols $c_s^{(k')}$, $s \in [S]$, are already decoded. The decoding of the symbols $c_s^{(k)}$, $s \in [S]$, is carried out as follows:

1) For every $s \in [S]$ such that $k \in \mathcal{A}_n^{(s)}$, the symbols $c^{(k)}_{\pi(s)}$ can be decoded based on the channel observations
and the formerly decoded symbols $c_s^{(k')}$, $s \in [S]$ and $k' \in [k-1]$. The symbol $c^{(k)}_{\pi(s)}$ is decoded using the polar
coding successive cancellation decoding procedure applied to the corresponding non-binary channel $W_s$, and
depends on the formerly decoded symbols $c^{(k')}_{\pi(s)}$, $k' \in [k-1]$.

2) Set $d = |\{s : k \in \mathcal{A}_n^{(s)}\}|$.

3) Find the codeword $\mathbf{c} = (c_1, c_2, \ldots, c_S)$ in $\mathcal{C}_d$ such that for every $s' \in \{s : k \in \mathcal{A}_n^{(s)}\}$

$$c^{(k)}_{\pi(s')} = c_{\pi(s')}. \tag{31}$$

4) For every $s' \notin \{s : k \in \mathcal{A}_n^{(s)}\}$, decode the symbols $c^{(k)}_{\pi(s')}$ according to (31).

The same reasoning as in Proposition 3 shows that the non-binary scheme also achieves the capacity for the
provided model. What is left to provide is a generalization of the symmetry results in [1] discussed in Remark 6.
Specifically, it remains to show that the error performance of the non-binary polar codes under successive
cancellation is independent of the symbol vectors $(c_s^{(1)}, c_s^{(2)}, \ldots, c_s^{(n)})$, $s \in [S]$, in (29). This result is provided in
the Appendix and follows along similar arguments as in [1].

V. Summary and Conclusions

Parallel polar coding schemes are provided in this paper for communicating over a set of parallel memoryless
binary-input output-symmetric channels. The provided coding schemes are based on the channel
polarization method originally introduced by Arikan [1] for a single-channel setting. The first provided scheme
is shown to achieve capacity for the particular case of stochastically degraded channels. For non-degraded parallel
channels, upper and lower bounds on the achievable rates are derived for the provided scheme based on the
techniques in [12]. Two modifications of the parallel polar coding scheme are provided, which achieve the capacity
of the general non-degraded case.

The definition of polar codes includes a set of predetermined and fixed bits. These bits are crucial to the decoding
process. In the original polarization scheme in [1], these predetermined and fixed bits may be chosen arbitrarily (in
the case of symmetric channels). For the provided parallel coding schemes, on the other hand, the predetermined
and fixed bits are determined based on algebraic coding constraints. For the particular case of degraded channels,
the information bits of channels determine the predetermined and fixed bits of their degraded counterparts. The
MDS coding suggested in this paper is similar to the rate-matching scheme in [13]. For the general non-degraded
case, either interleaving of binary polar codes or non-binary channel polarization is used. The modification based
on non-binary channel polarization is almost directly applicable to the case of non-binary parallel channels.

The following topics are considered for further research: 

1) Symmetry condition: For symmetric channels, the predetermined and fixed bits may be chosen arbitrarily.
For non-symmetric channels, good predetermined and fixed bits (also called frozen bits in [1]) are shown to
exist, but their choice may not be arbitrary. It is an open question whether there is a more general construction that
does not require symmetry of the parallel channels. This may be accomplished using non-binary codes (the
single-channel case is addressed to some extent in [8]).

2) Generalized parallel polar coding as in JQll- llTTl . 

3) Generalized channel models. Arbitrarily-permuted channels are just one particular case of the compound
setting. It is of great interest to enlarge the family of parallel channels to which the studied coding scheme
may apply. Of specific interest is the case of parallel channels where a sum-rate constraint is provided by the
channel model characterization.

Appendix

In this appendix we show that the performance of the non-binary polarization provided in [5] and [8] under successive
cancellation decoding is independent of the input vector (which includes both the information symbols and the predetermined and
fixed symbols). The applied proof technique follows similar steps to the binary case provided in [1]. We
consider a polarization scheme where transmission takes place over a DMC whose input alphabet $\mathcal{X}$ is a finite field.
It is assumed that the polarization scheme is defined through the generator matrix $G_n$, where all operations are carried out over the
considered finite field. In order to achieve the message independence property, we rely on the following symmetry
definition for the non-binary case:

Definition 6 (Non-binary symmetry). A memoryless channel which is characterized by a transition probability $p$,
an input alphabet $\mathcal{X}$ and a discrete output alphabet $\mathcal{Y}$ is symmetric if there exists a function $T : \mathcal{Y} \times \mathcal{X} \to \mathcal{Y}$
which satisfies the following properties:

1) For every $x \in \mathcal{X}$, the function $T(\cdot, x) : \mathcal{Y} \to \mathcal{Y}$ is bijective.

2) For every $x_1, x_2 \in \mathcal{X}$ and $y \in \mathcal{Y}$, the following equality holds:

$$p(y \,|\, x_1) = p\bigl(T(y, x_2 - x_1) \,|\, x_2\bigr).$$

Let $p$ be a symmetric DMC with an input alphabet $\mathcal{X}$ and output alphabet $\mathcal{Y}$. In addition, let $T$ be the
corresponding function in Definition 6. With abuse of notation, the operation of $T$ on vectors $\mathbf{y} \in \mathcal{Y}^n$ and $\mathbf{x} \in \mathcal{X}^n$
is carried out according to

$$T(\mathbf{y}, \mathbf{x}) = \bigl(T(y_1, x_1), T(y_2, x_2), \ldots, T(y_n, x_n)\bigr).$$

Negation of a vector is also defined item-wise, that is, $-(x_1, \ldots, x_n) = (-x_1, \ldots, -x_n)$.
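As an illustration of Definition 6, the following Python sketch checks both properties numerically for a $q$-ary symmetric channel with the candidate map $T(y, x) = y + x \bmod q$. The channel, the prime field size $q$ (chosen so that field arithmetic is plain modular arithmetic) and the function names are assumptions made for this example only; they do not appear in the text above.

```python
from itertools import product


def qary_symmetric_channel(q, eps):
    """p[x][y]: the sent symbol with prob. 1 - eps, any other with eps / (q - 1)."""
    return [[1 - eps if y == x else eps / (q - 1) for y in range(q)]
            for x in range(q)]


def satisfies_definition(p, q, T, tol=1e-12):
    """Check the two properties of the symmetry definition for a given map T.

    Assumption: the input alphabet is the prime field Z_q, so x2 - x1 is
    computed modulo q. A False result only rules out this particular T.
    """
    outputs = range(len(p[0]))
    # Property 1: T(., x) is a bijection on the output alphabet for every x.
    for x in range(q):
        if sorted(T(y, x) for y in outputs) != list(outputs):
            return False
    # Property 2: p(y | x1) == p(T(y, x2 - x1) | x2) for all x1, x2, y.
    for x1, x2, y in product(range(q), range(q), outputs):
        if abs(p[x1][y] - p[x2][T(y, (x2 - x1) % q)]) > tol:
            return False
    return True


if __name__ == "__main__":
    q = 5
    additive = lambda y, x: (y + x) % q
    print(satisfies_definition(qary_symmetric_channel(q, 0.1), q, additive))
```

For the $q$-ary symmetric channel the check succeeds because $y + x_2 - x_1 = x_2$ holds exactly when $y = x_1$, so both sides of property 2 pick the same transition probability.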

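Polar successive cancellation decoding is accomplished based on decisions made according to the split channel
output probabilities. For the case of non-binary polarization, the corresponding split channels are defined according
to

$$p_n^{(l)}(\mathbf{y}, \mathbf{w} \,|\, x) = \frac{1}{|\mathcal{X}|^{n-1}} \sum_{\mathbf{c} \in \mathcal{X}^{n-l}} p_n\bigl(\mathbf{y} \,|\, (\mathbf{w}, x, \mathbf{c})\bigr), \quad l \in [n] \qquad (32)$$

where $\mathbf{y} \in \mathcal{Y}^n$, $\mathbf{w} \in \mathcal{X}^{l-1}$, and $x \in \mathcal{X}$. Note that for binary input alphabets this definition reduces to the
binary split-channel definition. The error event under successive cancellation decoding is a subset of the following union

$$\bigcup_{d \in \mathcal{X} \setminus \{0\}} \; \bigcup_{i \in \mathcal{A}} \mathcal{E}_i^d$$

where $\mathcal{A}$ is the set of indices of split channels which polarize to perfect channels and

$$\mathcal{E}_i^d = \Bigl\{ (\mathbf{w}, \mathbf{y}) \in \mathcal{X}^n \times \mathcal{Y}^n : \; p_n^{(i)}\bigl(\mathbf{y}, (w_1, \ldots, w_{i-1}) \,|\, w_i\bigr) \leq p_n^{(i)}\bigl(\mathbf{y}, (w_1, \ldots, w_{i-1}) \,|\, w_i + d\bigr) \Bigr\}. \qquad (33)$$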

On the other hand, non-binary channel polarization guarantees that there exists a symbol set $\{w_i\}_{i \in [n] \setminus \mathcal{A}}$ such that the
probability of the event $\mathcal{E}_i^d$ approaches zero for every $d \in \mathcal{X} \setminus \{0\}$ and $i \in \mathcal{A}$.

The following lemma assures that for symmetric channels the probabilities of the events $\mathcal{E}_i^d$, $i \in \mathcal{A}$ and $d \in \mathcal{X} \setminus \{0\}$, are independent
of the input vector $\mathbf{w}$. Consequently, for symmetric channels the error performance guaranteed by non-binary
channel polarization in [5] is attained regardless of the symbols chosen for $\{w_i\}_{i \in [n] \setminus \mathcal{A}}$.

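Lemma 2 (Message independence property for non-binary symmetric-channel polarization). Denote by
$P_e(\mathcal{E}_i^d \,|\, \mathbf{u})$ the probability of the event $\mathcal{E}_i^d$ in (33), assuming that the input vector is $\mathbf{w} = \mathbf{u}$. Then,

$$P_e(\mathcal{E}_i^d \,|\, \mathbf{u}) = P_e(\mathcal{E}_i^d \,|\, \mathbf{0})$$

for every $\mathbf{u}$ in $\mathcal{X}^n$, where $\mathbf{0}$ is the all-zero vector in $\mathcal{X}^n$.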

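Proof: Based on the symmetry property of the channel, for every $i \in [n]$, $\mathbf{y} \in \mathcal{Y}^n$, $\mathbf{w} = (w_1, \ldots, w_{i-1}) \in \mathcal{X}^{i-1}$, $w_i \in \mathcal{X}$ and
$\mathbf{a} \in \mathcal{X}^n$, we have

$$\begin{aligned}
p_n^{(i)}\bigl(\mathbf{y}, (w_1, \ldots, w_{i-1}) \,|\, w_i\bigr)
&\overset{(a)}{=} \frac{1}{|\mathcal{X}|^{n-1}} \sum_{\mathbf{c} \in \mathcal{X}^{n-i}} \prod_{t=1}^{n} p\Bigl(y_t \,\Big|\, \bigl((\mathbf{w}, w_i, \mathbf{c}) G_n\bigr)_t\Bigr) \\
&\overset{(b)}{=} \frac{1}{|\mathcal{X}|^{n-1}} \sum_{\mathbf{c} \in \mathcal{X}^{n-i}} \prod_{t=1}^{n} p\Bigl(T\bigl(y_t, (\mathbf{a} G_n)_t\bigr) \,\Big|\, \bigl((\mathbf{w}, w_i, \mathbf{c}) G_n\bigr)_t + (\mathbf{a} G_n)_t\Bigr)
\end{aligned}$$

where $(\mathbf{x})_t$ denotes the $t$-th element of a vector $\mathbf{x} = (x_1, \ldots, x_n)$, (a) follows for memoryless channels from the definition of the
polarization scheme and (32), and (b) follows from the symmetry property of the channel. Consequently, it follows that

$$p_n^{(i)}\bigl(\mathbf{y}, (w_1, \ldots, w_{i-1}) \,|\, w_i\bigr) = p_n^{(i)}\bigl(T(\mathbf{y}, \mathbf{a} G_n), (w_1, \ldots, w_{i-1}) + (a_1, \ldots, a_{i-1}) \,|\, w_i + a_i\bigr). \qquad (34)$$

From (33) and (34), it follows for every pair $(\mathbf{w}, \mathbf{y}) \in \mathcal{X}^n \times \mathcal{Y}^n$ and every $\mathbf{a} \in \mathcal{X}^n$ that

$$(\mathbf{w}, \mathbf{y}) \in \mathcal{E}_i^d \iff \bigl(\mathbf{a} + \mathbf{w}, T(\mathbf{y}, \mathbf{a} G_n)\bigr) \in \mathcal{E}_i^d. \qquad (35)$$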
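The step from (33) and (34) to (35) can be spelled out as follows. The chain below applies (34) to both sides of the inequality that defines the event in (33), once with $w_i$ and once with $w_i + d$ in the conditioning; it uses only the notation already introduced and assumes nothing beyond (33) and (34).

```latex
\begin{align*}
(\mathbf{w}, \mathbf{y}) \in \mathcal{E}_i^d
&\iff p_n^{(i)}\bigl(\mathbf{y}, (w_1, \ldots, w_{i-1}) \,|\, w_i\bigr)
      \le p_n^{(i)}\bigl(\mathbf{y}, (w_1, \ldots, w_{i-1}) \,|\, w_i + d\bigr) \\
&\iff p_n^{(i)}\bigl(T(\mathbf{y}, \mathbf{a} G_n), (w_1 + a_1, \ldots, w_{i-1} + a_{i-1}) \,|\, w_i + a_i\bigr) \\
&\qquad\quad \le p_n^{(i)}\bigl(T(\mathbf{y}, \mathbf{a} G_n), (w_1 + a_1, \ldots, w_{i-1} + a_{i-1}) \,|\, w_i + a_i + d\bigr) \\
&\iff \bigl(\mathbf{a} + \mathbf{w}, T(\mathbf{y}, \mathbf{a} G_n)\bigr) \in \mathcal{E}_i^d .
\end{align*}
```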

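Next, let $1_{\mathcal{E}_i^d}(\mathbf{u}, \mathbf{y})$ denote the indicator of the event $\mathcal{E}_i^d$. For every $\mathbf{u} \in \mathcal{X}^n$ it follows that

$$\begin{aligned}
P_e(\mathcal{E}_i^d \,|\, \mathbf{u})
&\overset{(a)}{=} \sum_{\mathbf{y} \in \mathcal{Y}^n} p_n(\mathbf{y} \,|\, \mathbf{u}) \, 1_{\mathcal{E}_i^d}(\mathbf{u}, \mathbf{y}) \\
&\overset{(b)}{=} \sum_{\mathbf{y} \in \mathcal{Y}^n} p_n\bigl(T(\mathbf{y}, -\mathbf{u} G_n) \,|\, \mathbf{0}\bigr) \, 1_{\mathcal{E}_i^d}\bigl(\mathbf{0}, T(\mathbf{y}, -\mathbf{u} G_n)\bigr) \\
&\overset{(c)}{=} \sum_{\mathbf{y} \in \mathcal{Y}^n} p_n(\mathbf{y} \,|\, \mathbf{0}) \, 1_{\mathcal{E}_i^d}(\mathbf{0}, \mathbf{y}) \\
&= P_e(\mathcal{E}_i^d \,|\, \mathbf{0})
\end{aligned}$$

where (a) follows from (5), (b) follows from the symmetry property of the channel and from (35) by plugging $\mathbf{a} = -\mathbf{u}$, and (c) follows since $T(y, x)$ is a bijective
function of $y \in \mathcal{Y}$ for every fixed symbol $x \in \mathcal{X}$. ■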
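Identity (34), on which the proof rests, can be checked numerically by brute force on a small instance. The Python sketch below does so under assumptions that are not taken from the text: the field is the prime field $\mathbb{Z}_3$, the block length is $n = 4$, $G_n$ is the Kronecker power of the kernel with rows (1, 0) and (1, 1), the channel is a 3-ary symmetric channel, and $T(y, x) = y + x \bmod 3$. All names are hypothetical.

```python
import itertools
import random

Q = 3      # prime field size, so arithmetic is plain mod-Q (assumption)
N = 4      # block length
EPS = 0.2  # error probability of the illustrative Q-ary symmetric channel


def generator_matrix(n):
    """Kronecker power of [[1, 0], [1, 1]], without bit-reversal (assumption)."""
    g = [[1]]
    while len(g) < n:
        m = len(g)
        g = [row + [0] * m for row in g] + [row + row for row in g]
    return g


G = generator_matrix(N)


def encode(u):
    """Row vector u times G over Z_Q."""
    return [sum(u[r] * G[r][t] for r in range(N)) % Q for t in range(N)]


def channel(y, x):
    """Single-letter transition probability p(y | x)."""
    return 1 - EPS if y == x else EPS / (Q - 1)


def T(y, x):
    """Component-wise output map; here T(y, x) = y + x mod Q."""
    return [(yt + xt) % Q for yt, xt in zip(y, x)]


def split_channel(i, y, prefix, x):
    """Split channel of (32): i is 1-based and len(prefix) == i - 1."""
    total = 0.0
    for c in itertools.product(range(Q), repeat=N - i):
        prob = 1.0
        for yt, xt in zip(y, encode(list(prefix) + [x] + list(c))):
            prob *= channel(yt, xt)
        total += prob
    return total / Q ** (N - 1)


def check_identity(trials=200, seed=0):
    """Largest gap between both sides of (34) over random (i, y, w, a)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        i = rng.randint(1, N)
        y = [rng.randrange(Q) for _ in range(N)]
        w = [rng.randrange(Q) for _ in range(i)]   # (w_1, ..., w_i)
        a = [rng.randrange(Q) for _ in range(N)]
        lhs = split_channel(i, y, w[:-1], w[-1])
        shifted = [(wj + aj) % Q for wj, aj in zip(w, a)]
        rhs = split_channel(i, T(y, encode(a)), shifted[:-1], shifted[-1])
        worst = max(worst, abs(lhs - rhs))
    return worst


if __name__ == "__main__":
    print(check_identity())  # expected to be at floating-point rounding level
```

The check samples random arguments rather than enumerating all of them, which keeps the running time short; it is a sanity check of (34) on one small instance and not a substitute for the proof.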